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ABSTRACT 


Title  of  Dissertation:  Approximate  Nonlinear  Filtering  with  Applications 

to  Navigation 


Babak  Azimi-Sadjadi,  Doctor  of  Philosophy,  2001 


Dissertation  directed  by:  Professor  P.  S.  Krishnaprasad 

Department  of  Electrical  and  Computer  Engineering 


In  this  dissertation  we  address  nonlinear  techniques  in  filtering,  estimation, 
and  detection  that  arise  in  satellite  based  navigation.  Here,  we  emphasize  the 
theoretical  aspect  of  these  techniques,  and  we  also  address  their  applications. 

We  first  introduce  particle  filtering  for  an  exponential  family  of  densities.  We 
prove  that  under  certain  conditions  the  approximated  conditional  density  con¬ 
verges  to  the  true  conditional  density.  For  the  case  where  the  conditional  density 
does  not  lie  in  an  exponential  family  but  stays  close  to  it,  we  show  that  under 
certain  assumptions  the  error  of  the  estimate  given  by  this  approximate  nonlin¬ 
ear  filtering,  projection  particle  filtering ,  is  bounded.  We  give  similar  results  for  a 
family  of  mixture  densities.  We  use  projection  particle  filtering  for  an  exponential 
family  of  densities  to  estimate  the  position  of  a  mobile  platform  that  has  a  com¬ 
bination  of  inertial  navigation  system  (INS)  and  global  positioning  system  (GPS), 


referred  to  as  an  integrated  INS/GPS.  We  show  via  numerical  experiments  that 
projection  particle  filtering  exceeds  regular  particle  filtering  methods  in  navigation 
performance. 

Using  carrier  phase  measurements  enables  the  differential  GPS  to  reach  cen¬ 
timeter  level  accuracy.  The  phase  lock  loop  of  a  GPS  receiver  cannot  measure  the 
full  cycle  part  of  the  carrier  phase.  This  unmeasured  part  is  called  integer  ambigu¬ 
ity ,  and  it  should  be  resolved  through  other  means.  Here,  we  present  a  new  integer 
ambiguity  resolution  method.  In  this  method  we  treat  the  integer  ambiguity  as 
a  random  digital  vector.  Using  particle  filtering,  we  approximate  the  conditional 
probability  mass  function  of  the  integer  ambiguity  given  the  observation.  The 
resolved  integer  is  the  MAP  estimate  of  the  integer  given  the  observation. 

Reliability  of  a  positioning  system  is  of  great  importance  for  navigation  pur¬ 
poses.  Therefore,  an  integrity  monitoring  system  is  an  inseparable  part  of  any 
navigation  system.  Failures  or  changes  due  to  malfunctions  in  sensors  and  actu¬ 
ators  should  be  detected  and  repaired  to  keep  the  integrity  of  the  system  intact. 
Since  in  most  practical  applications,  sensors  and  actuators  have  nonlinear  dynam¬ 
ics,  this  nonlinearity  should  be  reflected  in  the  corresponding  change  detection 
methods.  In  this  dissertation  we  present  a  change  detection  method  for  nonlin¬ 
ear  stochastic  systems  based  on  projection  particle  filtering.  The  statistic  for  this 
method  is  chosen  in  such  a  way  that  it  can  be  calculated  recursively,  while  the  com¬ 
putational  complexity  of  the  method  remains  constant  with  respect  to  time.  We 
present  some  simulation  results  that  show  the  advantages  of  this  method  compared 
to  linearization  techniques. 
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Chapter  1 


Introduction 


Position  estimation  is  one  of  the  key  issues  in  the  automated  control  of  a  vehicle. 
Positioning  methods  can  be  classified  into  three  different  groups.  Dead  reckoning  is 
based  on  piece-wise  integration  of  the  speed  and  heading  of  the  vehicle  to  calculate 
the  position  with  respect  to  a  known  starting  point.  Clearly,  the  estimate  of  the 
position  gets  worse  as  time  goes  on,  i.e.  the  error  in  the  estimation  accumulates. 
The  Inertial  Navigation  System  (INS)  type  of  positioning  is  based  on  Newton’s 
second  law.  In  the  INS,  the  system  uses  the  acceleration  and  its  direction  to  find 
(again  using  integration)  the  current  location.  This  method  is  more  accurate  than 
the  dead-reckoning  method,  and  it  can  be  used  in  almost  all  applications.  Since  the 
calculation  of  the  current  position  is  based  on  the  integration  of  the  instantaneous 
acceleration,  this  method  suffers  from  the  same  deficiencies  as  the  dead-reckoning 
method  does.  The  third  type  of  positioning  is  based  on  measuring  the  distance  of 
the  unknown  location  from  several  known  positions.  The  accuracy  of  this  method 
depends  on  the  accuracy  of  the  measurement  and  the  accuracy  of  our  knowledge  of 
the  location  of  the  known  points.  Unlike  the  INS  and  the  dead-reckoning  methods, 
this  method  does  not  suffer  from  the  accumulation  of  error  over  the  duration  of 
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the  measurement.  In  fact,  the  longer  the  measurement  takes  the  more  accurate 
the  estimate  becomes.  The  Global  Positioning  System  (GPS)  measurement  is  one 
of  the  third  positioning  types. 

GPS  was  conceived  as  a  ranging  system  from  known  positions  of  satellites  in 
space  to  unknown  positions  on  land,  in  sea,  in  air  and  in  space.  Effectively,  the 
satellite  signal  is  continuously  marked  with  its  own  transmission  time  so  that,  when 
received,  the  signal  travel  time  can  be  measured  by  a  synchronized  receiver.  Apart 
from  point  positioning,  the  determination  of  a  vehicle’s  instantaneous  position  and 
velocity,  and  precise  coordination  of  time  were  original  objectives  of  GPS  [28]. 

GPS  uses  “pseudoranges”  derived  from  the  broadcast  satellite  signal.  The  pseu¬ 
dorange  is  derived  either  from  measuring  the  travel  time  of  the  coded  signal  and 
multiplying  it  by  its  velocity  or  by  measuring  the  phase  of  the  signal.  In  both 
cases,  the  clocks  of  the  receiver  and  the  satellite  are  employed.  Since  these  clocks 
are  never  perfectly  synchronized,  instead  of  true  ranges  pseudoranges  are  obtained 
where  the  synchronization  error  (denoted  as  clock  error)  is  taken  into  account. 
Consequently,  each  equation  of  this  type  comprises  four  unknowns:  the  desired 
three  point  coordinates  contained  in  true  range,  and  the  clock  error.  Thus,  infor¬ 
mation  from  (at  least)  four  satellites  is  necessary  to  solve  for  the  four  unknowns. 

Differential  GPS  allows  the  user  to  obtain  a  more  accurate  measurement.  It, 
in  fact,  allows  the  removal  of  a  good  portion  of  the  positioning  error  from  the 
estimation.  This,  along  with  other  new  technology,  allows  the  users  to  use  the 
carrier  phase  as  part  of  the  positioning  information.  This  can  increase  the  accuracy 
of  the  estimation  to  centimeter,  or  in  the  static  case,  to  millimeter  levels. 

Unlike  the  applications  in  communication,  in  positioning  one  needs  to  know 
the  exact  phase  difference  between  the  received  signal  and  the  transmitted  signal, 
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i.e.  the  exact  number  of  full  cycles  and  the  portion  of  the  phase  that  is  less 
than  a  full  cycle  is  needed  for  position  estimation.  A  Phase  Lock  Loop  (PLL) 
can  provide  a  very  accurate  estimate  of  the  portion  of  the  phase  that  is  less  than 
a  full  cycle.  Also,  it  can  count  the  number  of  full  cycles  added  to  the  phase 
once  it  starts  tracking  the  signal  continuously.  The  initial  value  of  the  full  cycles, 
though  is  not  known,  therefore,  the  phase  lock  loop  cannot  provide  that  part  of  the 
phase  information.  This  part,  which  is  a  constant  integer  number  of  full  cycles, 
is  called  integer  ambiguity  and  should  be  resolved  through  numerical  methods 
[26,  28,  51,  52], 

Although  carrier  phase  differential  GPS  allows  for  very  accurate  positioning, 
it  is  very  sensitive  to  obstacles  that  can  block  satellite  signals.  The  loss  in  signal 
could  be  for  a  few  moments  or  for  a  longer  period  of  time.  If  the  loss  in  signal 
is  sufficiently  short,  the  phase  lock  loop  is  unable  to  detect  the  loss  in  signal, 
therefore  it  is  not  able  to  record  the  added  full  cycles  to  the  measured  phase.  This 
results  in  a  jump  in  the  measured  phase.  This  phenomena  is  known  as  cycle  slip. 
Any  navigation  method  that  uses  carrier  phase  differential  GPS  should  be  able  to 
detect  and  isolate  the  cycle  slips  whenever  they  occur. 

If  the  loss  in  signal  is  for  a  sufficiently  long  period  of  time,  so  that  position 
information  is  needed  while  the  GPS  receiver  has  signals  only  from  three  or  fewer 
satellites,  the  positioning  techniques  that  are  solely  based  on  satellite  navigation 
fail  to  function.  In  such  cases  the  user  should  use  other  methods  to  be  able  to 
receive  continuous  position  information  [1,  16,  19,  20,  42,  56]. 

Integration  of  INS  with  GPS  has  proven  to  be  robust  and  accurate.  In  an 
integrated  INS/GPS,  INS  provides  positioning  information  that  is  calibrated  by 
GPS. 
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In  most  applications,  such  as  integrated  INS/GPS,  or  dead- reckoning/ GPS,  or 
vehicle  dynamic/GPS,  linearization  of  the  dynamic  and  the  GPS  observation  is 
the  main  tool  for  estimation  [20,  21,  42,  44,  45].  It  can  be  shown  [16]  that  when 
the  number  of  satellites  is  below  a  certain  number  or  the  geometry  of  the  satellite 
constellation  is  near  singular,  the  Extended  Kalman  Filter  (EKF)  diverges  and 
fails  to  provide  accurate  estimation  of  the  position.  In  this  case,  it  is  important  to 
use  nonlinear  filtering  for  the  estimation  problem. 

The  results  in  [16]  were  a  motivation  for  us  to  study  nonlinear  filtering,  estima¬ 
tion,  and  detection  methods  and  their  applications  to  satellite  based  navigation. 
In  this  dissertation  we  are  interested  both  in  the  theory  and  the  application  of 
such  methods.  For  this  reason  our  intention  is  to  discover  new  tractable  finite  ap¬ 
proximation  methods  for  nonlinear  filtering  problems.  We  are  specially  interested 
in  the  approximation  methods  that  are  suited  for  satellite  based  navigation. 

Our  contribution  in  this  dissertation  can  be  categorized  into  three  major  areas. 
In  all  of  these  three  areas  we  have  developed  the  theory  of  the  proposed  approx¬ 
imation  methods  as  well  as  the  relevant  application  to  navigation.  In  the  rest  of 
this  chapter  we  introduce  these  three  areas. 

1.1  Approximate  Nonlinear  Filtering 

Unlike  the  linear  Gaussian  case,  no  finite  dimensional  filtering  method  for  gen¬ 
eral  nonlinear  systems  exists.  The  most  well  known  approximation  method  for 
nonlinear  filters,  Extended  Kalman  Filtering  (EKF),  is  merely  an  ad  hoc  method 
[46].  The  performance  of  EKF  depends  on  the  specific  application  and  it  is  not 
guaranteed. 
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Projection  filtering  is  another  approximation  method  for  nonlinear  filtering 
[9,  11,  12].  The  main  assumption  in  projection  filtering  is  that  the  conditional 
density  of  the  state  given  the  observations  can  be  projected  onto  a  family  of  den¬ 
sities  without  significant  error.  In  [11]  the  conditional  density  is  projected  onto  an 
exponential  family  of  densities.  Since  the  exponential  family  has  a  finite  dimen¬ 
sional  parametric  representation,  the  projected  nonlinear  filter  also  has  a  finite 
dimensional  form. 

In  a  different  approach  [9],  the  conditional  density  of  the  system  given  the 
observations  is  approximated  by  a  summation  of  basis  functions.  Then,  a  Galerkin 
approximation  method  is  used  to  propagate  the  coefficients  of  the  approximated 
density. 

Although  both  methods  in  [9]  and  [11]  provide  better  approximation  methods 
than  EKF,  the  convergence  of  the  approximated  conditional  density  to  the  actual 
conditional  density  is  not  studied  1 . 

An  entirely  different  approach  for  approximating  the  conditional  density  is 
simulation  based  filtering.  Grid-less  simulation  based  filtering,  now  known  by  many 
different  names  such  as  particle  filtering  [34,  40],  the  Condensation  Algorithm  [29], 
the  Sequential  Monte  Carlo  (SMC)  Method  [22],  and  Bayesian  Bootstrap  Filtering 
[24],  was  first  introduced  in  [24]  and  then  it  was  rediscovered  independently  in  [29] 
and  [32],  Henceforth  we  refer  to  this  filtering  method  as  particle  filtering.  The 
results  in  [24]  are  the  extension  of  the  results  in  [48]  and  [2]  to  the  dynamic  case 
and  is  based  on  a  method  called  Sampling/Importance  Resampling  (SIR).  SIR  is 
key  element  of  the  grid-less  simulation  based  filtering  methods.  SIR  allows  these 

Hn  [9]  a  convergence  proof  is  reported  but  in  a  remark  the  authors  note  that:  “The  requirement 
in  the  hypothesis  of  Theorem  1  is  somewhat  unsatisfactory  because  it  is  not  clear  at  this  stage 
how  to  guarantee  that  this  is  true  a  priori” . 
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methods  to  have  automatically  high  resolution  grids  in  areas  where  the  conditional 
density  is  significant  and  low  resolution  in  the  areas  where  the  conditional  density 

is  small. 

Particle  filtering  is  a  Monte  Carlo  based  method  for  nonlinear  filtering.  The 
particles  in  this  method  refer  to  independent  samples  generated  with  the  Monte 
Carlo  method.  In  [40]  it  was  shown  that  the  optimal  nonlinear  filter  can  be  approx¬ 
imated  with  an  arbitrarily  small  error  by  a  finite  dimensional  filter.  The  problem 
of  this  method  is  that  for  high  dimensional  systems,  and  for  small  errors,  com¬ 
putational  complexity  grows,  and  the  method  is  not  always  implementable  in  real 
time  applications.  The  other  shortcoming  of  particle  filtering  is  its  vulnerability  to 
sample  impoverishment  [15],  so  that  the  particle  distribution  gives  a  poor  approx¬ 
imation  of  the  required  conditional  density.  In  extreme  cases,  after  a  sequence  of 
updates  the  particle  system  can  collapse  to  a  single  point.  In  less  extreme  cases, 
although  several  particles  may  survive,  there  is  so  much  internal  correlation  that 
summary  statistics  behave  as  if  they  are  derived  from  a  substantially  smaller  sam¬ 
ple.  To  compensate,  large  numbers  of  particles  are  required  in  realistic  problems 

[15]. 

In  the  cases  where  we  have  some  prior  information  about  the  distribution, 
we  should  expect  to  achieve  higher  performance  if  we  take  this  information  into 
account.  By  higher  performance,  we  mean  a  reduction  in  the  computational  cost 
and  an  increase  in  the  convergence  rate.  Here  we  assume  that  the  conditional 
distribution  has  a  density  in  an  exponential  family  of  densities,  or  at  least  stays 
close  to  it  in  a  sense  that  we  will  define.  Using  this  assumption,  we  replace  the 
empirical  distribution  in  [40]  with  the  Maximum  Likelihood  Estimate  (MLE)  of  the 
parameters  of  an  exponential  density.  We  call  this  new  method  projection  particle 
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filtering.  In  Theorem  4.1.6  we  show  that  if  the  conditional  density  of  the  state 
given  the  observations  lies  in  an  exponential  family  of  densities  then  the  estimated 
conditional  density  converges  to  the  true  conditional  density  in  a  sense  that  will 
be  defined.  In  Theorem  4.2.7  for  the  case  where  the  true  conditional  density  stays 
close  to  an  exponential  family  of  densities  we  show  that  the  error  of  the  estimate 
given  by  projection  particle  filtering  is  bounded. 

As  stated  in  [11],  finding  the  proper  exponential  family  of  densities  for  a  dy¬ 
namical  system  is  quite  challenging.  To  overcome  this  problem  and  motivated  by 
the  results  in  Theorems  4.1.6  and  4.2.7,  we  studied  projection  filtering  for  a  family 
of  mixture  densities.  In  this  case,  we  also  show  that  if  the  family  of  mixture  den¬ 
sities  is  close  (in  a  sense  that  will  be  defined  later)  to  the  true  conditional  density, 
the  error  of  estimate  given  by  approximate  filtering  is  bounded. 

One  of  the  applications  of  projection  particle  filtering  is  position  estimation  for 
an  integrated  INS/GPS.  We  are  particularly  interested  in  the  cases  where  lineariza¬ 
tion  methods  fail.  One  such  case  is  when  the  number  of  GPS  satellites  in  view 
is  below  a  critical  number  (for  three  dimensional  positioning,  this  critical  number 
is  four).  We  demonstrate  numerically  that  in  this  situation  the  position  estima¬ 
tion  given  by  the  EKF  diverges,  while  the  approximate  nonlinear  filtering  methods 
provide  a  reasonable  estimate  of  the  position.  We  also  show  via  numerical  results 
that  the  performance  of  the  projection  particle  filter  exceeds  the  performance  of 
the  particle  filter  for  the  same  number  of  particles. 

1.2  Integer  Ambiguity  Resolution 

Integer  ambiguity  resolution  methods  are  an  inseparable  part  of  positioning  tech¬ 
niques  that  use  carrier  phase  differential  GPS  as  part  of  their  measurement. 
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The  available  integer  ambiguity  resolution  methods  are  mostly  based  on  a  rough 
estimate  of  the  integer  ambiguity  and  a  search  method  to  find  the  correct  integer 
value  [3].  In  the  case  of  the  Ambiguity  Function  Method  (AFM),  since  the  integer 
ambiguity  cancels  out  as  a  result  of  the  cosine  function  in  the  ambiguity  function, 
the  search  is  done  over  the  position  grid  [38,  47].  In  the  least  square  ambiguity 
search  technique,  first  the  float  solution  for  integer  ambiguity  is  found  by  mini¬ 
mizing  the  square  of  the  error  associated  to  the  position  and  the  integer  estimate. 
If  the  covariance  matrix  of  the  error  for  these  estimates  of  the  integer  ambiguity 
is  diagonal,  the  best  integer  vector  that  minimizes  the  error  is  the  nearest  integer 
vector,  but  usually  this  is  not  the  case.  Therefore,  the  correct  solution  is  found 
by  searching  the  area  near  the  float  solution  [27].  The  size  of  this  area  depends 
on  the  covariance  matrix  and  the  size  of  the  integer  vector,  i.e.  the  number  of 
satellites.  In  the  Least-squares  AMBiguity  Decorrelation  Approach  (LAMBDA),  a 
linear  transformation  of  the  GPS  observables  that  maps  integer  vectors  to  integer 
vectors,  is  chosen  in  such  a  way  that  the  transformed  covariance  matrix  is  dom¬ 
inantly  diagonal  [52],  This  transformation  helps  to  reduce  the  size  of  the  search 
space.  Variations  of  these  methods  have  been  used.  For  example,  in  [26]  a  Kalman 
filter  is  used  to  estimate  the  float  least  square  estimation  of  the  integer  ambiguity 
and  the  same  type  of  decorrelation  is  applied  to  the  observable  to  reduce  the  size 
of  the  search  space. 

In  most  of  these  methods  the  integer  ambiguity  is  treated  as  an  unknown  in¬ 
teger  vector.  In  this  dissertation  we  present  a  new  method  that  treats  the  integer 
ambiguity  as  a  random  integer  vector.  Inspired  by  our  results  in  Theorems  4.1.6 
and  4.2.7,  we  present  a  method  for  approximating  the  conditional  probability  mass 
function  (pmf)  of  this  integer  vector  given  the  observations.  The  estimate  of  the 


integer  value  then  is  simply  the  point  that  maximizes  the  pmf.  In  this  method, 
similar  to  the  projection  particle  filtering  method,  we  find  a  family  of  exponential 
distributions  that  is  close  to  the  true  pmf.  The  integer  ambiguity  is  then  resolved 
through  the  estimation  of  the  parameter  of  the  family. 

1.3  Detection  of  Abrupt  Changes  in  Nonlinear 
Dynamical  Systems 

In  [43]  the  change  detection  problem  is  stated  as  follows: 

“Whenever  observations  are  taken  in  order  it  can  happen  that  the  whole  set  of 
observations  can  be  divided  into  subsets,  each  of  which  can  be  regarded  as  a  random 
sample  from  a  common  distribution,  each  subset  corresponding  to  a  different  pa¬ 
rameter  value  of  the  distribution.  The  problems  to  be  considered  in  this  paper  are 
concerned  with  the  identification  of  the  subsamples  and  the  detection  of  changes  in 
the  parameter  value”. 

We  refer  to  a  change  or  an  abrupt  change  as  any  change  in  the  parameters  of 
the  system  that  happens  either  instantaneously,  or  much  faster  than  any  change 
that  the  nominal  bandwidth  of  the  system  allows. 

The  key  difficulty  of  all  change  detection  methods  is  that  of  detecting  intrinsic 
changes  that  are  not  necessarily  directly  observed  but  are  measured  together  with 
other  types  of  perturbations  [8]. 

The  change  detection  could  be  off-line  or  on-line.  In  on-line  change  detection, 
we  are  only  interested  in  detecting  the  fact  that  a  change  happened.  In  this  case, 
we  are  only  interested  in  detecting  the  change  as  quickly  as  possible  (for  example, 
to  minimize  the  detection  delay  with  fixed  mean  time  between  false  alarms),  and 
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the  estimate  of  the  time  when  the  change  occurs  is  not  of  importance.  In  off-line 
change  detection,  we  assume  that  the  whole  observation  sequence  is  available  at 
once.  In  this  case,  the  estimate  of  the  time  of  change  could  be  one  of  the  goals  of 
the  detection  method.  In  this  dissertation  we  limit  our  concern  to  on-line  detection 
of  abrupt  changes. 

The  change  detection  methods  that  are  studied  in  this  dissertation  can  be  clas¬ 
sified  under  the  general  name  of  Likelihood  Ratio  methods.  Cumulative  SUM 
(CUSUM)  and  Generalized  Likelihood  Ratio  (GLR)  tests  are  among  these  meth¬ 
ods.  CLTSLIM  was  first  proposed  in  [43].  The  most  basic  CUSUM  algorithm  as¬ 
sumes  that  the  observation  signal  is  a  sequence  of  stochastic  variables  which  are 
independent  and  identically  distributed  with  known  common  probability  density 
function  before  the  change  time,  and  independent  and  identically  distributed  with 
another  known  probability  density  after  the  change  time.  In  the  CUSUM  algorithm 
the  log-likelihood  ratio  for  the  observation  from  time  i  to  time  k  is  calculated  and 
its  difference  with  its  current  minimum  is  compared  with  a  certain  threshold.  If 
this  difference  exceeds  the  threshold  an  alarm  is  issued. 

Properties  of  the  CUSUM  algorithm  have  been  studied  extensively.  The  most 
important  property  of  the  CUSUM  algorithm  is  its  asymptotic  optimality,  which 
was  first  proven  in  [37].  More  precisely,  CUSUM  is  optimal,  with  respect  to  the 
worst  mean  delay,  when  the  mean  time  between  false  alarms  goes  to  infinity.  This 
asymptotic  point  of  view  is  convenient  in  practice,  because  a  low  rate  of  false 
alarms  is  always  desirable. 

In  the  case  of  unknown  system  parameters  after  change,  the  GLR  algorithm 
can  be  used  as  a  generalization  of  the  CUSUM  algorithm.  Since  in  this  algorithm 
the  exact  information  of  the  change  pattern  is  not  known,  the  likelihood  ratio  is 
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maximized  over  all  possible  change  patterns  2 . 

For  stochastic  systems  with  linear  dynamics  and  linear  observations,  the  ob¬ 
servation  sequence  is  not  independent  and  identically  distributed.  Therefore,  the 
regular  CUSUM  algorithm  cannot  be  applied  for  detection  of  changes  in  such 
systems.  However,  if  such  systems  are  driven  by  Gaussian  noise,  the  innovation 
process  associated  with  the  system  can  be  generated.  This  process  is  known  to  be 
a  sequence  of  independent  random  variables.  The  regular  CUSUM  algorithm  or  its 
more  general  counterpart,  GLR,  can  be  applied  to  this  innovation  process  [8,  55]. 

In  this  dissertation  we  are  interested  in  the  change  detection  problem  for 
stochastic  systems  with  nonlinear  dynamics  and  observations.  We  show  that  for 
such  systems,  the  complexity  of  the  CUSUM  algorithm  grows  with  respect  to  time. 
This  growth  in  complexity  cannot  be  tolerated  in  practical  problems.  Therefore, 
instead  of  the  statistic  used  in  the  CUSUM  algorithm  we  introduce  an  alternative 
statistic.  We  show  that  with  this  statistic,  the  calculation  of  the  likelihood  ra¬ 
tio  can  be  done  recursively  and  the  computational  complexity  of  the  method  stays 
constant  with  respect  to  time.  This  new  method  is  used  for  the  cycle  slip  detection 
for  an  integrated  INS /GPS. 

1.4  Dissertation  Outline 

In  Chapter  2  we  briefly  review  the  GPS  signal  structure  and  explain  different 
GPS  observables.  Chapter  3  is  devoted  to  the  review  of  different  approximate 
nonlinear  filtering  methods  as  well  as  a  statement  of  the  general  nonlinear  filtering 
framework.  In  Chapter  4  we  present  our  main  results  on  projection  particle  filtering 
for  an  exponential  family  of  densities.  Chapter  5  addresses  the  applications  of 
2If  the  maximum  does  not  exist,  the  supremum  of  the  likelihood  ratio  should  be  calculated. 
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the  results  in  Chapter  4  to  position  estimation  for  an  integrated  INS/GPS  under 
critical  conditions.  In  Chapter  6  we  present  onr  results  on  projection  particle 
filtering  on  a  family  of  mixture  densities.  We  introduce  our  new  integer  ambiguity 
resolution  method  based  on  projection  particle  filtering  in  Chapter  7.  In  Chapter  8 
we  present  our  results  in  change  detection  for  nonlinear  systems  and  its  application 
to  cycle  slip  detection  for  an  integrated  INS/GPS.  Finally,  in  Chapter  9  we  state 
conclusions  and  an  outline  of  future  work. 
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Chapter  2 


A  Short  Review  of  GPS 


The  NAVSTAR  (Navigation  Satellite  Timing  and  Ranging)  GPS  is  a  satellite 
based,  worldwide,  all  weather  navigation  system.  This  system  provides  accurate 
positioning  for  a  receiver  that  is  capable  of  receiving  signals  from  at  least  four 
satellites  [28]. 

The  main  part  of  the  GPS  signal  is  a  coded  message  that  is  simply  a  clock 
signal.  This  coded  message  and  the  time  that  this  message  was  sent  is  completely 
known  by  the  receiver.  The  receiver  measures  the  time  when  this  signal  is  received 
and  from  that  measures  the  travel  time  and,  consequently,  the  distance  between 
the  receiver  and  the  corresponding  satellite.  Since  the  clocks  in  the  satellites  and 
in  the  receiver  are  never  synchronized  the  measured  distance  is  not  the  true  range. 
In  GPS  literature  this  distance  is  called  pseudorange. 

All  generated  signals  including  the  carrier  in  GPS  are  synchronized  with  the 
main  atomic  clock,  therefore  the  carrier  phase  (if  known  completely)  can  also  be 
used  as  a  ranging  signal. 

The  accuracy  of  the  positioning  depends  on  many  factors  including  the  type 
of  user,  the  quality  of  the  receiver,  and  the  positioning  technique.  The  U.S.  De- 
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partment  of  Defense  deliberately  adds  uncertainty  to  the  positioning  signal;  for 
civilians  this  is  one  of  the  major  sources  of  error.  There  are  other  sources  of  er¬ 
ror  that  degrade  the  position  accuracy.  These  sources  include,  ionospheric  and 
tropospheric  delay,  satellite  position  uncertainty,  satellite  and  receiver  clock  bias, 
multipath,  and  the  usual  channel  noise  [44],  In  a  relatively  small  area,  for  exam¬ 
ple  distances  of  less  that  100  Km,  some  of  these  errors  are  highly  correlated  [45]. 
The  uncertainty  added  by  the  military,  satellite  clock  bias  1 ,  and  satellite  position 
uncertainty  are  clearly  the  same  for  all  users  that  are  using  the  same  satellite. 
The  tropospheric  and  the  ionospheric  delays  are  also  highly  correlated  in  short 
distances.  By  locating  a  receiver  in  a  known  position  one  can  estimate  the  com¬ 
mon  errors  and  send  the  correction  signal  to  the  other  users.  This  idea  caused  a 
revolution  in  satellite  aided  radio  positioning.  This  technique  is  called  differential 
GPS,  and  is  widely  used  for  surveying  as  well  as  real  time  navigation  [45]. 

Today’s  technology  allows  the  use  of  the  carrier  as  part  of  the  navigation  in¬ 
formation.  Due  to  the  periodic  nature  of  the  carrier,  one  can  only  measure  the 
phase  of  the  carrier,  modulo  2n,  i.e.  the  PLL  can  never  measure  the  exact  phase. 
The  unknown  part  of  the  phase  is  known  to  be  an  integer  number  times  2n.  Since 
the  measurement  noise  for  the  carrier  is  much  smaller  than  the  measurement  noise 
for  the  clock  signal,  it  is  essential  that  we  should  estimate  the  exact  phase  of  the 
signal.  It  is  shown  that  in  the  case  of  differential  GPS,  the  receiver  can  estimate 
the  ambiguity  in  the  integer  number  of  unmeasurable  cycles.  This  method  is  called 
carrier  phased  differential  GPS. 

1After  May  2000  the  US  department  of  defense  eliminated  this  uncertainty  for  all  users.  The 
satellite  position  accuracy  though,  is  higher  for  military  users. 
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2.1  GPS  Signal  Structure 


The  GPS  signal  consists  of  a  clock  signal  and  a  navigation  message  that  are  ampli¬ 
tude  modulated.  Using  the  clock  signal  and  navigation  message,  one  can  estimate 
its  position,  if  at  least  four  satellites  are  in  view. 

Each  satellite  sends  the  clock  signal  in  two  different  bands,  L i  and  L2.  These 
signals  are  as  follows  [50]: 

L\(t)  =  a\Pl{t)Dl{t)cos{2'Kf1t)  +  b\C  /  Al(t)D(t)sin{fit) 

L\{t)  =  a2Pl  {t)Dl  (t)cos(2ir  f2t) 


Where: 

•  i:  Number  of  the  satellite. 

•  Pl(t ):  Precise  clock  signal  generated  with  a  random  number  generator  with 
frequency  10.23  MHz  and  a  period  of  38  weeks.  Each  satellite  has  its  unique 
code. 

•  C/Al:  Course  acquisition  code,  the  clock  for  non- military  positioning  gener¬ 
ated  with  frequency  1.023  MHz  and  a  period  of  1  ms. 

•  Dl(t ):  Navigation  data  with  a  bit  rate  of  50  bit/sec. 

•  /p  Carrier  of  L  { ,  f\  —  154*  10.23  M Hz  synchronized  with  the  central  clock. 

•  f2:  Carrier  of  L2l  f2  =  120  *  10.23  MHz  synchronized  with  the  central  clock. 

•  a\,bi,a2 :  Amplitudes  of  the  carriers. 
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The  GPS  receiver,  receives  the  signal  corrupted  by  noise  and  other  sources  of 
error.  The  raw  measurements  of  the  code  and  the  carrier  phase  can  be  presented 
as  follows  [30]: 

P\tk)  =  pl(tk)  +  c[dT(tk)  -  df(tk )]  +  Tl(tk)  +  P(tk)  +  Ei(tk)  +  e\tk) 

A &(tk)  =  pl(tk)  +  c[dT(tk)  -  dt\tk )}  +  T\tk)  -  P(tk )  +  E\tk)  -  AiV*  +  r)\tk) 

where 

•  tk  :  GPS  time  at  epoch  k. 

•  P  :  code  observation  (m). 

•  %  :  satellite  number. 

•  p  :  distance  between  the  moving  object  and  the  satellite  position  (m). 

•  c  :  speed  of  light  (m/s). 

•  dT  :  receiver  clock  bias  (s). 

•  dt  :  satellite  clock  bias  including  Selective  Availability  (SA)  clock  error  (s). 

•  E  :  effect  of  ephemeris  error  including  SA  orbit  error  (m). 

•  /  :  ionospheric  delay  (m). 

•  T  :  tropospheric  delay  (m). 

•  e:  code  observation  noise  (m). 

•  A  :  carrier  wavelength  (m). 

•  <f>:  carrier  phase  observation  (cycles). 
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N  :  integer  ambiguity  (cycles). 


•  rj  :  carrier  observation  noise  (m). 

Access  to  the  above  observations  depends  on  the  type  of  user  and  the  quality 
of  the  receiver. 


2.2  Single  and  Double  Differencing  in  GPS 

State  of  the  art  receivers  can  have  access  to  code  and  carrier  phase  measurement 
of  12  satellites  in  2  frequencies.  In  this  kind  of  receivers  a  big  portion  of  the 
ionospheric  delay  can  be  corrected  and  removed  [28,  33].  Since  the  receiver  clock 
bias  is  the  same  for  the  observation  from  all  satellites,  the  error  due  to  the  receiver 
clock  bias  can  be  completely  removed  by  single  differencing.  In  single  differencing, 
the  receiver  subtracts  code  and/or  phase  measurement  of  one  satellite  from  the 
others  [53].  Single  differencing  eliminates  a  major  source  of  error.  If  it  is  possible 
to  mount  a  GPS  receiver  in  a  known  location  (i.e.  base),  one  can  use  the  double 
differencing  method  to  eliminate  other  sources  of  error.  Within  short  distances, 
ionospheric  and  tropospheric  errors  are  highly  correlated,  and  can  be  eliminated 
by  making  a  difference  between  the  code  and  the  carrier  phase  measurement  of 
the  base  and  the  moving  receiver.  The  appropriate  length  scale  for  this  is  not 
very  clear  and  it  depends  on  sunspot  activity  [33].  When  the  activity  is  low,  the 
short  distance  could  cover  larger  areas  and  conversely.  It  can  be  shown  that  double 
differencing  reduces  ephemeris  error  by  a  factor  of  d/r  [53],  where  d  and  r  are  the 
distances  from  the  moving  object  to  the  base  and  to  the  satellite,  respectively. 
Using  the  operator  (•)^-  =  [(•)/  —  (•)*]  —  [(•)*  —  (•)*■],  where  i  and  j  are  indices  for 
the  receivers  and  /  and  k  are  indices  for  satellites.  Then  double  differencing  can 
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be  written  as  follows: 


pfc,/  _  k,l  .  k,l 

*,j  ~  Pi, 3  ei,j  i 


(2.1) 


and, 


\  xfc,/  AkZ  .  \  7 vrAhZ  , 

—  Pij  +  +  ViJ- 


(2.2) 


In  the  above  formula  the  time  index  is  not  shown  for  simplicity.  If  the  short 
baseline  assumption  is  not  applied,  the  canceled  term  will  show  up  in  the  double 
difference  observer  and  should  be  estimated  [53] .  Sometimes  there  are  not  enough 
observations  to  estimate  all  of  these  terms.  In  this  case, we  are  forced  to  consider 
these  terms  as  noise  terms.  Multi-path  is  another  source  of  error  that  cannot  be 
removed  from  the  observation  (2.1)  and  (2.2)  [7]. 

Unlike  other  terms,  if  no  cycle  slip  occurs,  the  integer  ambiguity  is  constant 
with  respect  to  time.  The  fact  that  the  integer  ambiguity  is  constant  in  time 
is  very  important,  in  fact,  all  integer  ambiguity  resolution  methods  rely  on  this 
property.  Once  this  integer  number  is  known,  the  phase  measurement  can  be  used 
for  positioning.  We  should  remember  that  although  equation  (2.1)  does  not  have 
integer  ambiguity  in  it,  still  the  energy  of  the  noise,  e*j,  is  an  order  of  magnitude 
higher  than  the  energy  of  the  noise  in  (2.2)  [44],  Therefore,  the  integer  ambiguity 
problem  remains  intact. 

Although  double  differencing  eliminates  many  sources  of  error,  it  is  not  nec¬ 
essarily  the  best  way  of  handling  the  measurement.  Double  differencing  reduces 
the  number  of  observation  equations  which  may  not  be  the  optimum  choice  for 
certain  applications.  Therefore,  if  a  good  model  for  a  specific  error  exists,  we  can 
use  this  model  to  estimate  the  error  instead  of  eliminating  it  through  the  double 
differencing  operation.  As  we  mentioned  earlier  single  differencing  eliminates  the 
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error  due  to  receiver  clock  bias  by  subtracting  the  phase/code  measurement  of 
one  satellite  from  the  others,  but  this  reduces  the  number  of  measurement  equa¬ 
tions.  We  can  remove  this  part  from  the  double  differencing  operation,  i.e.  we  can 
only  subtract  the  measurements  of  one  receiver  from  the  base  (the  receiver  in  a 
known  location).  In  this  case,  we  can  eliminate  a  good  portion  of  the  error  due  to 
ionospheric,  tropospheric,  ephcmeris,  and  satellite  clock  bias.  The  receiver  clock 
bias,  (IT,  can  be  modeled  by  a  second  order  system  driven  by  a  Brownian  motion 
process.  In  Chapter  5  we  use  this  model  for  estimating  the  position  of  the  receiver 
as  well  as  the  clock  bias  for  an  integrated  INS/GPS. 


2.3  Cycle  Slip  in  Carrier  Phase  Measurement 

Carrier  phase  measurement  enables  a  GPS  receiver  to  reach  centimeter  level  ac¬ 
curacy.  This  is  true  only  if  the  exact  phase  is  measured.  In  addition  to  this,  the 
receiver  should  track  the  phase  at  all  times  to  be  able  to  measure  the  exact  phase. 
This  task  is  done  by  the  PLL  built  in  the  receiver. 

When  the  receiver  is  turned  on,  the  fraction  of  the  phase  (i.e.  the  difference 
between  the  satellite  transmitted  carrier  and  a  receiver  generated  replica  signal) 
is  observed  and  an  integer  counter  initialized.  During  tracking,  the  counter  is 
incremented  by  one  whenever  the  fractional  phase  changes  from  2n  to  0.  Thus,  at 
a  given  time  the  observed  accumulated  phase  is  2ti  times  the  sum  of  the  fractional 
phase  and  the  integer  count.  The  initial  integer  number  of  full  cycles  between 
the  satellite  and  the  receiver  is  unknown.  This  phase  ambiguity  remains  constant 
as  long  as  no  loss  of  signal  lock  occurs.  If  a  loss  occurs,  the  integer  counter  is 
reinitialized  which  causes  a  jump  in  the  instantaneous  accumulated  phase  by  an 
integer  number  of  cycles.  This  jump  is  called  a  cycle  slip  which,  of  course,  is 
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restricted  to  phase  measurements  [28]. 

Three  sources  of  cycle  slips  can  be  distinguished.  First,  cycle  slips  are  caused 
by  obstruction  of  the  satellite  signal  due  to  trees,  buildings,  bridges,  etc.  This  is 
the  most  frequent  source  of  cycle  slip.  The  second  source  for  cycle  slips  is  a  low 
signal  to  noise  ratio  due  to  bad  ionospheric  conditions,  multipath,  high  receiver 
dynamics,  or  low  satellite  elevation.  A  third  source  is  a  failure  in  the  receiver 
software  which  leads  to  incorrect  signal  processing.  Cycle  slips  could  also  be  caused 
by  malfunctioning  satellite  oscillators,  but  these  are  rare  [28]. 

Cycle  slip  detection  is  a  very  important  part  of  a  navigation  system  that  is 
based  on  carrier  phase  GPS.  If  a  cycle  slip  is  not  detected  correctly  the  position 
given  by  the  navigation  system  is  not  reliable.  In  Chapter  8  we  propose  a  new 
method  that  has  the  potential  ability  of  cycle  slip  detection  even  under  conditions 
when  the  number  of  satellites  is  below  a  critical  number. 
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Chapter  3 


Nonlinear  Filtering:  An 
Introduction 


Filtering  problems  consist  of  “estimating”  a  process  {xt}  (or  a  function  of  it) 
given  the  related  process,  {y* } ,  which  can  be  observed  [18].  The  observation  is 
available  in  an  interval,  i.e.,  {ys,0  <  s  <  t}  and  the  function  of  the  state  is 
estimated  at  time  t.  Except  for  the  linear  Guassian  system  and  very  special  cases  in 
nonlinear  settings,  estimating  the  state  given  the  observations  results  in  an  infinite 
dimensional  filter  [46].  Therefore,  approximation  methods  of  finite  dimension  are 
very  appealing. 

The  most  widely  used  approximate  filtering  method  is  the  extended  Kalman 
filter,  which  is  a  heuristic  approach  based  on  linearization  of  the  state  dynamics 
and  the  observation  near  the  nominal  path  [46].  EKF  is  computationally  simple 
but,  the  convergence  of  the  estimated  conditional  density  to  the  actual  conditional 
density  is  not  guaranteed. 

Projection  filtering  is  another  approximation  method  [9,  11,  12,  13].  In  projec¬ 
tion  filtering  it  is  assumed  that  the  conditional  density  of  the  state  of  the  system 


21 


can  be  approximated  by  a  member  of  a  parametric  family  of  densities.  In  this 
case,  estimating  the  conditional  density  is  equivalent  to  estimating  the  parameter 
of  the  family.  In  [11]  the  exponential  family  of  densities  is  chosen  as  the  parametric 
family.  In  contrast,  the  approach  in  [9]  employs  a  Galerkin  approximation  to  solve 
the  Fokker-Planck  equation  [46],  between  measurement  epochs. 

Particle  filtering  is  an  approximation  method  for  nonlinear  filtering  and  it  is 
based  on  the  Monte  Carlo  method;  in  this  method,  the  particles  at  time  tt  are 
i.i.d.  random  vectors  that  are  distributed  according  to  the  empirical  conditional 
distribution  of  the  state,  given  the  observations  up  to  time  tt.  These  particle/state 
vectors  are  used  in  the  state  equation  to  find  the  values  of  particles  at  time  U+\. 
Then  at  time  1,  the  empirical  distribution  is  evaluated  according  to  the  values 
of  the  particles.  The  new  observation  at  time  £j+1  is  taken  into  account  through 
Bayes’  Rule  to  calculate  the  conditional  empirical  distribution,  this  process  is  then 
repeated.  In  [40]  it  is  proved  that  by  tracking  a  large  enough  number  of  particles, 
one  can  get  an  approximate  conditional  distribution  that  is  arbitrarily  close  to  the 
true  conditional  distribution. 


3.1  Problem  Setup 

We  assume  that  all  stochastic  processes  are  defined  on  a  fixed  probability  space 
(f2,  F,  P),  and  a  finite  time  interval,  [0,  T],  on  which  there  is  defined  an  increasing 
family  of  cr-fields,  {Ft,  0  <  t  <  T}.  It  is  assumed  that  each  process,  {xf},  is 
adapted  to  Ft,  i.e.,  {xi}  is  ^-measurable  for  all  t.  We  assume  that  {xt}  is  a 
vector  diffusion  process  of  the  form 

xt  =  x0+  [  f 's(xs)ds  +  (  Gs(xs)dws,  (3.1) 

Jo  Jo 
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where  xt  G  lZn,  and  wt  G  lZq  is  a  vector  from  an  independent  Brownian  motion 
process;  the  second  integral  is  in  the  Ito  sense  [49],  and  the  function  f)(-)  and  the 
matrix  Gt(-)  have  the  proper  dimensions.  The  observation,  yt,  is  a  discrete  time 
process  given  as  follows: 


y«r  h„  (xnT)  T  vn, 


(3.2) 


where  ynr  G  lZd,  and  vn  G  7Tl  is  a  discrete  time  white  Gaussian  noise  process  with 
zero  mean  and  known  covariance  matrix.  The  state  dynamics  and  observation 
equations  can  be  rewritten  formally  as  follows: 


cbq  =  ft(pct)dt  +  Gtfx.t)dwt,  given  the  distribution  of  x0 

(3.3) 

yn.r  =  hn(xnT)+vn 

The  noise  processes  {wt,  t  >  0},  and  {vn,  n  =  0, 1,  •  •  •}  ,  and  the  initial  condition 
x0  are  assumed  to  be  independent.  We  use  Qt  and  Rn  for  the  covariance  matrices 
of  the  processes  w(  and  vn,  respectively.  We  assume  that  Rn  is  invertible  for  all 
n’s.  We  have  the  following  additional  assumptions  [25]: 


A  3.1.1  [local  Lipschitz  continuity]  V  x,  x '  G  Br  and  t  G  [0,  T\,  where  Br  is  a  ball 
of  radius  r,  we  have 


||ft(x)  —  ft  (x7)  ||  <  fcr||x  —  x'H ,  and 

\\Gt(x)QtGf(x)  Gt(x,)QtGf(x,)||  <  fcr||x  -  x'||. 

A  3.1.2  [Non-Explosion]  There  exists  k  >  0  such  that 

xIb(x)  <  ^(l  +  llxl|2);  and 

trace{Gt(pT)QtGf  (x))  <  k(l  +  ||x||2). 


(3.4) 


(3.5) 


V  t  G  [0,  T]  and  VxG  7 Zn. 
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Under  Assumptions  (A3. 1.1)  and  (A3. 1.2),  there  exists  a  unique  solution  {xt, 
t  G  [0,  T]}  to  the  state  equation,  and  xt  has  finite  moment  of  any  order  [25]. 

In  addition  to  these,  we  assume  that  the  probability  distribution  of  the  state  xt, 
given  the  observation  up  to  time  t,  7 q(dx)  =  P(xt  G  dx|y*),  where  y*  =  |yn,  i  = 
1 ,  •  •  •  ,n,  nr  <  t},  has  a  density  pt  with  respect  to  the  Lebesgue  measure  on  lZn. 
Then  {pt,  t  >  0}  satisfies  the  following  partial  differential  equation  and  updating 
equations  [11]: 


where 


jj-tpt  =  C*pt  nr  <t  <  (n  +  1  )r,  and 

Pnr  Cn\l/  nPnT~ 


(3.6) 


r;(K)  =  -  e?_,  1  e”s=1 

[a?]  =  GtQtGf, 

*n(x)  =  exp  -  h„(x))Tfl-1(y„r  -  h„(x)))  , 

and  cn  is  a  normalizing  factor. 

Except  for  the  linear  Gaussian  case,  and  some  very  special  nonlinear  cases, 
solving  System  (3.6)  constitutes  an  infinite  dimensional  filter  [46].  Therefore,  for 
practical  problems  it  is  necessary  to  approximate  the  conditional  density  in  (3.6). 
In  the  next  section,  we  discuss  one  of  these  approximation  methods. 


3.2  Projection  Filtering  on  Exponential  Families 
of  Densities 

This  section  is  mainly  a  review  of  the  results  we  use  from  [11],  We  start  this  section 
with  the  definition  of  the  exponential  family  of  densities. 
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Definition  3.2.1  Let  {ci,---,cp}  be  affinely  independent  1  scalar  functions  de¬ 
fined  on  7 Zn,  arid  assume  that  the  convex  set 

@o  =  j#  G  Vf  :  T (9)  —  log  J  exp  (0T c(x))  dx  <  oo| , 

has  nonempty  interior.  Then, 


S  =  {pM), 

p(-x,9):=exp  9Tc(x)  —  T(0)  , 

where  0  C  @0  is  open,  is  called  an  exponential  family  of  probability  densities. 


We  denote  by  S2  the  space  of  square  roots  of  the  densities  in  S  ,  i.e.,  S  2  = 

6  G  0}-  G  then  G  L2-  The  functions  i  = 

1,  •  •  •  ,p  form  a  basis  for  the  tangent  vector  space  at  \Jp(-,  0)  to  the  space  S%,  i.e., 
the  tangent  space  at  \Jpf,  9)  is  given  by  [4]: 

1  dp(-,9)  1  dp(-,9) 


\ fpifi) 


S2  =  span 


2 fA~e]  se  1  '  '  2vyM)  dK 


(3.7) 


The  inner  product  of  any  two  basis  elements  is  defined  as  follows 


dpi  .8) 


2y/p(-,8)  "i  ’  2 y/p(-,e)  dei 


1  dpi, 8)  \  _  1  r  1  dp(x,8)  dp(x,8)  j 

4  J  p(x,8)  d8,  ddj  U 


=  3  suW 


(3.8) 


It  can  be  easily  seen  that  g{9)  =  (gij(9))  =  (E[cicf\  —  E[cf\E[cj\)  is  the  Fisher 
information  matrix  of  pf,  9). 


Any  member  of  L2  can  be  projected  to  the  tangent  space  L 


\/pifi) 


■S'2  according 


to  the  following  projection  formula 


n 


8  ■ 


Lo  V 


"\/pi,0) 

p  p 

C  E 

i  1  j  1 


S* 


dpj.8) 


dpi, 8) 


(3.9) 


2  a/p(-,0) 

1{ci,  •  •  • ,  cp}  are  affinely  independent  if  for  distinct  points  Xi,  X2,  •  •  • ,  xp+i,  Ef=i  A»c(xj)  =  0 


and  Ef=i  E  =  0  implies  Ai  =  A2  =  •  •  •  =  Ap+i  =  0  [17]. 
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Projection  filtering  seeks  a  solution  pt  for  (3.6)  that  lies  in  S.  Of  course,  this 
solution  is  only  an  exponential  density,  but  we  hope,  by  choosing  the  proper  family, 
to  keep  the  approximation  error  small  (in  the  L 2  sense). 

If  we  consider  the  square  root  of  the  density  in  (3.6),  we  get 


_  1  d pt 

dt  Zy/Pt  dt 


(3.10) 


Define  atfi  =  •  We  assume  that  for  all  9  G  O  and  all  t  >  0,  Ep(.}g){\at)e\2}  < 

oo,  which  implies  that  is  a  vector  in  L2  for  all  9  G  O  and  all  t  >  0  [11]. 

Now  assume  that  in  equation  (3.10),  for  {y fpl,  t  >  t0},  starting  at  time  nr 
from  the  initial  condition,  ^pnT  =  \Jp(-,  9nr )  G  for  some  9nr  G  0.  Under  these 
assumptions,  the  right  hand  side  of  (3.10)  is  in  L2)  which  can  be  projected  into 
the  finite  dimensional  tangent  vector  space  L  — -5^.  The  propagation  part  of 

the  projection  filter  for  the  exponential  family,  S,  in  the  interval  [nr,  (n  +  l)r),  is 
defined  as  the  solution  to  the  following  differential  equation  in  the  same  interval: 


d\/Pt{-A)  =n  C*tPt(;6t) 


(3.11) 


9t  "2  y/pti-A) 

We  also  assume  that  hn(x)  in  equation  (3.2)  is  time  invariant,  i.e.,  h„(x)  = 
h(x),  and  the  components  of  h(x),  /i*(x),  and  ||h(x)|||,_i  are  linear  combinations 


of  q(x),  i  =  1,  -  -  - ,  p: 


^llh(x)lln-i  =  HA°D(X)  and  hk(~x)  —  ^  Afq(x),  fc  =  l,---,  d  (3.12) 

“  i=  1  i=l 


where  ||x|| ^  =  \/xrdx.  Then,  if  vn  is  stationary  with  the  covariance  matrix 
Rn  =  R ,  the  likelihood  function  'Pn(n)  can  be  written  as  follows: 


^n(x)  =  exp(— i(y lTR  1ynr))  exp(— |(hT(x)i?  Xh(x))  +  (y ^rR  xh(x))) 


=  An  exp 


p  ,  p 


E  A?c,(x)  +  E  (E  A ^T)ci(x) 

i= 1  k= 1  i=l 


(3.13) 
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where  znT  =  yJlTR~] .  and  An  is  a  constant  depending  on  ynr.  Therefore,  the 
coefficient  \hn(x)  is  a  member  of  exponential  family  of  densities.  This  family  is 
closed  under  multiplication.  Using  all  of  these  facts,  we  can  present  the  following 
theorem  [11]: 


Theorem  3.2.2  [Brigo  1996]  For  system  (3.3),  where  wt  is  a  Brownian  motion 
process  with  covariance  Qt  and  Vj  is  a  white  Gaussian  noise  with  covariance  R, 
we  assume  (A3. 1.1)  and  (A3. 1.2)  to  be  true.  We  also  assume  that  |||h(x)||^_i  = 
E  A°q(x)?  hk(x.)  =  E  A J'Ci(x),  for  k  =  1,  •  •  -,d,  and  Ep{.fi) ||2  <  oo,  VO  e 

i= 1 


0,  Vt  >  0.  Then  for  all  0  e  0,  and  all  t  >  0,  I j  E_£U£)  is  a  vector  on  the 
exponential  manifold  S 2.  The  projection  filter  density,  pf  =  Pt(-,@t )  is  described 

by 

&  m. 7  a  \  /  si.\ 

nr  <t  <  (n  +  l)r 


dy/ptbfit)  _ 
dt 


—  n0t 


'VU-A)  ’ 

Pnr(’)  b)nr)  Gi ^ n ( Y nr )Pm — ( ’ 7  0n7 — )  , 

and  the  projection  filter  parameter  satisfies  the  following  combined  differential  and 
stochastic  difference  equations: 


g(0t)d0t  =  Eot{£tc}dt, 


nr  <t  <  (n  +  l)r, 


Out  =  0nT-  -  +  Etl  Ag^, 


where 


and  Aq  =  [A1] ,  ■ 


n  B  l  n  B2 

c‘ 


• ,  \lp]T ,  i  =  0,  •  •  • ,  d,  and  zkn  is  the  kth  component  of  zfT  =  R  lynT. 


Henceforth,  we  shall  use  Eg  and  Ep(.j),  0nT  and  0n,  and  pnT  and  pn,  interchangeably, 
respectively. 
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Remark:  The  differential  equation  for  ft  is  an  ordinary  differential  equation 
with  the  vector  field  giOt)^1  Eet{Ctc} .  This  vector  field  should  be  computed  ana¬ 
lytically.  If  the  analytical  computation  of  this  vector  field  is  not  possible  an  off-line 
numerical  computation  should  be  carried. 

As  can  be  seen  from  the  statement  of  the  theorem,  the  calculation  of  the  con¬ 
ditional  probability  density  is  reduced  to  the  calculation  of  the  parameter  of  an 
exponential  family.  But,  solving  the  differential  equation  in  the  theorem  is  not 
an  easy  task.  At  each  moment  g(6t)  and  Egt{Ct c}  need  to  be  calculated.  This 
imposes  a  heavy  computational  load.  In  this  dissertation,  we  introduce  a  Monte 
Carlo  method  to  calculate  the  parameter  of  the  exponential  family  with  a  more 
affordable  computational  load. 

Although  projection  filtering  gives  a  better  solution  than  EKF,  there  is  no 
known  error  bound  with  which  we  can  compare  the  distance  between  the  real 
density  and  the  density  given  by  the  projection  filter.  In  the  next  section  we 
review  particle  filtering  as  an  alternative  to  optimal  nonlinear  filtering. 

Remark  :  The  assumption  on  h„(-)  and  Rn  in  this  are  made  only  to  ensure 
that  Tn(-)  is  in  the  family  of  exponential  densities.  These  assumptions  can  be 
relaxed  if  Tn(-)  is  guaranteed  to  stay  in  the  family. 

3.3  Particle  Filtering 

Consider  either  the  continuous  dynamics  and  discrete  observation  in  (3.3)  or  the 
discrete  case, 

xn_i_!  =  fn(xn)  +  Gn(xn)wn,  given  the  distribution  of  x0 

(3.14) 

yn=  h„(x„)+vn. 


We  assume  that  in  both  cases,  the  initial  distribution  for  xq  is  given.  The 


propagation  of  the  conditional  density,  at  least  conceptually,  can  be  calculated  as 
follows  [46]: 

•  Step  1  .  Initialization: 

p0(xo|yo)  =p(xo). 

•  Step  2  .  Diffusion: 

p(ri+i)_(xn+i|X,  )  =  j  p(xn+i|x„)pn  (xn|34)(ixn, 
where  yn  =  {y1?  y2,  •  •  • ,  yn}- 

•  Step  3  .  Bayes’  rule  update: 

P(yn+i|xn+i)p(n+i)_(xn+1|^n) 

P(n+ 1)  X"+1  n+1  /p(yn+i|xn+i)p(n+1)_(xn+1|^n)(ixn+1 

•  Step  4  .  n  <—  n  +  1;  go  to  Step  (2). 

The  conditional  density  given  by  the  above  steps  is  exact,  but  in  general  it 
can  be  viewed  as  an  infinite  dimensional  filter,  thus,  not  implementable.  Particle 
filtering,  in  brief,  is  an  approximation  method  that  mimics  the  above  calculations 
with  a  finite  number  of  operations  using  the  Monte  Carlo  method.  The  procedure 
for  particle  filtering  is  as  follows  [24,  40]: 

Algorithm  3.3.1  Particle  Filtering 

•  Step  1  .  Initialization 

o  Sample  x,[ ,-•••,  x^y  N  i.i.d.  random  vectors  with  the  initial  distribution 
^o(x). 

•  Step  2  .  Diffusion 


29 


❖  Find  x^+1,---,  x^+1  from  the  given  x*,  x^7  using  the  dynamic 

rules: 


or 


dxt  =  f t(xt)dt  +  Gt(xt)dwt,  nr  <t  <  (■ n  +  l)r 


x„+i  =  f„(xn)  +  Gn(xn)vn. 


Step  3  .  Find  the  empirical  distribution 


N 


N  ‘r-i  *»+i 

.7=1 


Step  4  ■  Use  Bayes  ’  Rule 


r>N 

>+l) 


•  'J'nH-l(x) 


N 

N  E  ,  (*n+l)  •  ^n+l(x^+l) 

j  =  1  n+1 


•  .Step  5  .  Resample 

o  Sample  x*+1,  •  •  • ,  x^+1  according  to  P^l!n+1(x) 

•  Step  6  .  n  <—  n  +  1;  go  to  Step  (2). 


where  <5v(w)  =  1  if  w  =  v  and  0  otherwise,  and  \l/n(x)  is  the  conditional  density 
of  the  observation  yn  given  the  state  x. 

It  is  customary  to  call  x^,  •  •  • ,  xjf  particles.  In  the  next  few  lines,  we  try  to 
explain  in  words  the  evolution  of  these  particles  using  the  above  algorithm. 

Let  xr'( ,  •  •  • ,  x^  be  the  distinct  particles  at  time  n  before  incorporating  the 
observation  at  time  n.  The  probability  of  each  particle  is  -L,  that  is,  is  uniformly 
distributed.  After  using  the  observations,  the  conditional  probability  of  each  par¬ 
ticle  changes.  Some  will  have  small,  and  some  large  probabilities.  Therefore,  in 
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the  process  of  resampling,  it  is  very  likely  that  some  particles  will  never  be  used 
and  instead  some  other  particles  (with  high  probabilities)  will  be  sampled  more 
than  once.  Therefore,  after  resampling,  some  particles  have  repeated  versions, 
but  in  the  diffusion  phase  they  go  through  different  paths  and  at  the  end  of  the 
diffusion  phase,  it  is  very  likely,  we  would  have  N  distinct  particles.  This  auto¬ 
matically  makes  the  approximation  one  of  better  resolution  in  the  areas  where  the 
probability  is  higher. 

In  [40]  it  is  proved  under  some  conditions  that 


lim  E 

AT— XX) 


Epjf(x)) 


(3.15) 


for  every  bounded  Borel  test  function,  /(•). 

One  problem  in  using  the  particle  filtering  method  is  the  computational  cost. 
In  particular,  for  a  high  dimensional  system,  getting  reasonable  accuracy  means 
using  a  large  N,  which  results  in  a  heavy  computational  cost.  In  the  next  chapter, 
we  propose  a  method  that  can  reduce  the  number  of  particles  for  a  certain  class  of 
problems. 
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Chapter  4 


Projection  Particle  Filtering 


In  the  previous  chapter,  we  saw  two  approximation  methods  for  nonlinear  filter¬ 
ing.  In  the  particle  filtering  method,  we  saw  that  the  conditional  distribution  is 
approximated  by  the  empirical  distribution.  Unlike  the  empirical  distribution,  in 
most  cases,  the  actual  conditional  distribution  is  smooth.  Intuition  suggests  that 
if  we  have  prior  knowledge  of  some  properties  of  the  distribution,  we  can  improve 
on  the  quality  of  the  estimates  over  just  using  the  empirical  distribution.  In  this 
chapter  first,  we  assume  that  the  conditional  density  lies  in  an  exponential  family 
of  densities.  We  will  see  that  with  this  assumption,  we  can  show  the  convergence  of 
the  approximated  density  to  the  actual  one.  Later,  we  relax  this  assumption  and 
we  only  require  that  the  conditional  density  stay  close  to  the  exponential  family  of 
densities.  We  prove  that  the  error  of  the  estimate  for  the  latter  case  is  bounded. 
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4.1  Particle  Filtering  for  Exponential  Families  of 


Densities 

For  System  (3.3),  we  assume  that  the  probability  density  of  xt,  given  the  observa¬ 
tion,  is  in  a  family  of  exponential  densities  S  1. 

With  this  assumption,  the  proposed  algorithm  is  as  follows: 

Algorithm  4.1.1  Particle  Filtering  for  an  Exponential  Family  of  Densities. 

•  Step  1  .  Initialization 

o  Sample  x,( ,  •  •  • ,  x(/;  N  i.i.d.  random  vectors  with  the  density,  p0(x). 

•  Step  2  .  Diffusion 

o  Find  x/+ ,,•••,  x^+1  from  the  given  x/,  •  •  • ,  x(/,  using  the  dynamic 
rule: 


djtt  =  it{xt)dt  +  Gt(x()dwt,  ir  <  t  <  (i  +  l)r 


Step  3  .  Find  the  MLE  of  0(n+ly  given  x/+1,  •  •  • ,  x^+1  [36] 

n 

0(n+1)-  =  arg max  exp(#Tc(x^+1)  -  T (9)) 


i=  1 


•  Step  4  .  Use  Bayes’  Rule 


p(x,6{n+1)) 


exp 

(^+i)-c(x)-T  (e{n+1)-)] 

)  ^n+l(x) 

/exp  ( 

>X+1,-c(x)-T(»(„+1)-)) 

^n+l(x)dx 

lrThis  assumption  is  rather  strong.  We  will  drop  this  assumption  later,  and  we  will  only 
assume  that  there  exists  a  known  family  of  densities  that  approximates  the  real  density  well,  i.e., 
with  acceptable  accuracy. 
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Step  5  .  Resample 


o  Sample  x*+1,  •  •  • ,  x^+l  according  to  p(x,  dn+i). 


•  Step  6  .  n  <—  n  +  1;  go  to  Step  (2). 


To  generate  x^+1,  •  •  • ,  x^+1,  a  Gibbs  sampler  can  be  used  [23].  This  brings  an  extra 
computational  cost,  which  should  be  taken  into  accoimt  when  choosing  Algorithm 
4.1.1  over  Algorithm  3.3.1. 

It  is  instructive  to  discuss  the  structure  of  the  ML  estimator.  We  are  going  to 
use  this  structure  for  the  proof  of  convergence. 

Let  x^,  ■  be  the  value  of  the  particles  right  before  the  measurement  at 

time  n.  The  MLE  of  9n,  9n,  satisfies  the  first  order  necessary  condition 

^  AT /x  Ci (x)  exp(^c(x) )dx  _  n 
i=i  fxexp(9Jnc(x))dx 

Therefore,  we  get 


N 


T7  E  cj(xn)  =  Een (ci(x))>  for  3  =  !,  •  •  •  ,P  ■ 

iV  i= i 


(4.1) 


Equation  (4.1)  says  that  the  sample  average  of  Cj(x)  and  its  probabilistic  average, 
evaluated  at  9n,  should  be  equal.  The  MLE  of  9  is  the  solution  to  the  system  of 
equations  in  (4.1).  Let  Fj[9)  be  as  follows: 

FM)=  lyvtxM  /cjWexp(9rc(x)Mx 

,{  Nh  ”  / exp(0Tc(x))dx  '  3  '  ,P' 

For  simplicity  we  drop  the  index  n  from  9n.  It  is  easy  to  see  that 


u  r  ■ 

~0f  =  ^(g(x)G'(x))  -  Ee(ci(x))Ee(Cj(x)). 

This  shows  that  (— =  <?($))  where  g(9)  is  the  Fisher  information  matrix  of 
the  exponential  density  at  9.  Since  Cj(x),i  =  1  are  affinely  independent 
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g(9)  >  0 ,  V0  G  0.  Therefore,  (4.1)  is  the  necessary  and  sufficient  condition  for 
optimality. 

In  the  next  few  pages,  we  prove  the  convergence  of  the  MLE  of  9n,  9n,  to  9n  in 
the  mean  square  sense. 

In  each  iteration  the  proposed  algorithm  starts  from  the  density  (x^y*), 
t  =  rn ,  where  9t  is  the  best  estimate  9t  according  to  the  algorithm.  After  a  full 
iteration,  the  algorithm  yields  9t+ 1  which  is  the  best  estimate  of  9t+ 1-  The  error 
in  9t+ 1  is  a  combination  of  the  series  of  possible  errors  for  which  we  want  to  find 
an  upper  bound.  The  first  source  of  error  is  the  error  in  9tl  which  will  propagate 
even  if  no  other  error  is  considered.  The  other  source  comes  from  the  fact  that  in 
each  iteration  new  particles  are  resampled  based  on  the  estimated  density  which 
is  different  from  the  actual  density.  Finally,  the  last  source  of  error  comes  from 
the  discretization  of  the  stochastic  dynamics  of  the  system.  We  want  to  emphasize 
that  here  we  assume  T„(x)  =  exp(-i(ynT  -  hn(xnT))TR~1(ynT  -  hn(xnT)))  lies  in 
the  chosen  family  of  densities.  Therefore,  no  other  error  is  added  to  the  estimate 
because  of  the  Bayes’  correction. 

We  recall  the  following  fact  [36]: 

Fact  4.1.2  For  the  family  of  densities  S  with  probability  density 

P(x,0)  =  exp(0Tc(x)  -  T(0)), 

the  Fisher  information  matrix  g (9)  =  ( E(ci(x)cj(x ))  —  E(ci(x))E(cj(x)))ij  is  pos¬ 
itive  definite.  Also  the  log  likelihood  function 

1(9)  =  9t C(x)  -  T(0), 
is  strictly  concave.  Therefore,  for 

ffi(x)  =  Ee[cj(*)],  j  =  l,---,p, 
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if  a  solution  exists2 ,  it  is  unique.  In  addition  if  x.  1,  •  •  • ,  nN  are  N  i.i.d.  random 
variables  distributed  according  top(x.,9),  then  the  MLE  of  6,  On,  is  asymptotically 
normal,  i.e. 

^  N 

0 N  =  argmax  n  p(xj,6>)  * 

o  i= 1 

VN(0n~0)  ~  A/"(0 ,g~\6)). 

Using  this  fact,  it  is  easy  to  see  that 

E  [\0N  -  Oj^j  =  ^  trace(g~1(0)), 

therefore,  when  N  — >  oo,  On  — >  0  in  the  m.s.  sense.  On  the  other  hand,  On  is 
the  solution  to  (4.1).  Using  the  strong  law  of  large  numbers  [10],  when  N  — »  oo 
the  LHS  in  (4.1)  goes  to  Eg(cj(x)),  j  =  1 ,  •  •  • ,  p.  with  probability  one.  In  other 
words,  the  solution  to  (4.1)  when  the  LHS  is  the  exact  Eg(cj(x)),  j  =  1 ,  -  •  - , p, 
gives  the  exact  solution  for  0.  Using  this  argument,  one  can  expect  that  by  finding 
a  good  estimate  of  the  left  hand  side  of  (4.1),  a  good  estimate  of  0  is  accessible.  In 
each  iteration  of  the  algorithm  presented  in  this  section  the  estimate  of  the  LHS 
of  (4.1)  is  found  by  using  the  Monte  Carlo  method  and  the  approximate  solution 
for  the  stochastic  differential  equation  (3.3). 

To  approximate  the  solution  to  the  stochastic  differential  equation  (3.3),  we 
employ  the  method  used  in  [39].  In  the  following,  we  review  this  method  briefly. 
The  stochastic  differential  equation  in  (3.3)  can  be  rewritten  as  follows: 

dxt  =  f t  (xt)  dt  +  Y^  St  (x0  dwrt,  (4.2) 

r=  1 

where  g[(-)  is  the  rth  column  of  the  matrix  Gt(-),  and  wrt  is  the  rth  component  of 
wt.  We  introduce  the  operators 

2In  [17]  it  is  shown  that  if  N  >  p,  the  solution  exists  almost  surely. 
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Lu  = 


1  q  n  n  02 

z  r=  1  i=l  j=l  C'XjG'Xj 


U, 


where  (a  ,  =  X!  Then,  the  approximate  solution  for  the  stochastic 

differential  equation  can  be  written  as  follows: 


Xfc+l  =  Xfc  +  E  #1  &/ia  +  Ukh  +  E  E  (Argr)n  tkh+ 


r= 1  r=l i=l 

5  E  (-kg’'  +  Qhi  +  (if) 


(4.3) 


r= 1 


/tfc  1  2  ’ 

where  h  is  the  step  size  and  the  coefficients  gf  ,»  f*fe>  (Ajgr)t  ,  etc.,  are  computed 
at  the  point  (tfc,Xfc).  Also,  the  sets  of  random  variables  ££,  are  independent  for 
distinct  k  and  can,  for  each  k,  be  modeled  as  follows: 

-1  ,i<j 

1  ,i>  3  ■ 

and  and  0  are  independent  random  variables  satisfying 

E(,  =  E(f  =  EQ  =  0.  E&  =  1,  E(f  =  3, 


e‘  =  -  \i «cct 


7v 


£6  =  PC3  =  0, 


PC2  =  0  =  1- 


In  particular,  can  be  modeled  by  the  law  P  (£  =  0)  =  |,  P  =  y/3j  =  |,  and 
P  (^  —  —V3j  =  |,  and  Q  can  be  modeled  by  P  (£  =  —1)  =  P  ((  =  1)  = 


Definition  4.1.3  We  say  that  a  function  u(-)  belongs  to  the  class  T ,  written  as 
u  G  T ,  if  we  can  find,  constants,  k  >  0,  and  n  >  0,  such  that  for  all  x  e  lZn ,  the 
following  inequality  holds: 


Before  we  present  our  results  we  need  to  specify  the  probability  space  in  which 
the  random  variables  are  defined.  As  we  mentioned  before,  the  stochastic  pro¬ 
cess  associated  to  the  dynamics  and  the  observation  equation  are  defined  on  a 
fixed  probability  space  (Q,F,P),  the  expectation  associated  to  this  probability 
space  is  denoted  by  E.  In  Algorithm  4.1.1  the  generated  particles  form  a  Markov 
process.  Similar  to  section  2.2  of  [40]  we  denote  the  probability  space  associated 
to  this  process  by  (IT,  F',  PA).  The  subindex  y  is  used  to  emphasize  that  the 
probability  measure  is  conditioned  on  the  observation  y.  Another  set  of  random 
variables,  £*,£*,  are  defined  for  the  approximation  of  the  stochastic  differential 
equation  (4.2).  We  denote  the  probability  space  associated  to  these  random  vari¬ 
ables  by  (Q" ,  F" ,  P").  The  expectation  associated  to  this  process  is  denoted  by 
E" .  Finally  we  define  (Q,  F,  P),  where  O  =  O  x  x  O"  and  F  =  F  x  Fl  x  F" .  For 
every  cD  e  we  define  u  =  (u>,  a/,  a/'),  then  for  every  A  e  F,  B  e  Fr,  and  C  G  F" 
we  define  the  probability  measure  P(A  x  B  x  C)  =  JA  (,/c  P[y]  (B)dP"  (u")^)  dP(u). 
The  expectation  with  respect  the  probability  measure  P  is  denoted  by  E. 

The  following  theorem  summarizes  the  weak  approximation  results  for  (4.3). 

Theorem  4.1.4  [  Milstein  [39]]  Suppose  (A3. 1.1)  from  Section  (3.1),  and  sup¬ 
pose  that  the  functions  f(-),  gr(-),  r  =  1  together  with  the  partial  deriva¬ 

tives  of  sufficiently  high  order,  belong  to  class  F .  Also,  suppose  that  the  functions 
A,;gr,  Lgr,  Arf,  and  Li  grow  at  most  as  a  linear  function  in  ||x||.  Then,  if  the 
function  u(-)  and  all  its  derivatives  up  to  order  6  belong  to  class  F ,  the  approx¬ 
imation  (4-3)  has  the  order  of  accuracy  2,  in  the  sense  of  weak  approximation, 
i.e., 

II Eu  (x0,x0  (4))  -  Eu  (x0,Xo  (4))  ||  <  Kh2,  tk  e  [0,  T], 
where  K  is  a  constant  and  x0iXo(-)  and  x0,x0(-)  are  the  exact  and  approximate 
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solutions  for  the  stochastic  differential  equation,  respectively. 

The  Monte  Carlo  approximation  of  Eu  (xo>Xo  (£&))  brings  another  error  term.  The 
combination  of  these  errors  can  be  expressed  as  follows: 

N  /  X 

Eu  (x0jXO  {tk))  -N  E  U  (Xq  ^  (tfc)J  < 

Eu  (x0  X()  (tfc))  -  Em  (x0>Xo  (4))  +  Eu  (xo,x o  (4))  -j?E«  x0  <  (4)  . 

i=i  v  0  7 

If  the  variance  of  u  (x0,Xo  (4))  is  bounded,  we  have 

~  ~  i  N  ,  .  b' 

E  Eu  (x0,xo  (tfc))  (x0>xi  (4))  <  A'h2  +  -^2 ,  (4.4) 

where  iC  and  k'  are  constants,  and  h  is  the  step  size  for  the  approximation  of  the 
solution  of  the  stochastic  differential  equation. 

The  next  lemma  relates  the  approximate  solution  to  the  stochastic  differential 
equation  and  the  estimate  of  the  parameter  6.  This  lemma  is  the  main  building 
block  for  our  result  in  this  section. 

Lemma  4.1.5  For  the  stochastic  differential  equation 

dx*  =  f t  (xt)  dt  +  Gt  (xt)  dwt,  x0,  t  €  [0,7/], 

assume  that  ft(-),  Gt(-)  are  such  that  for  the  Brownian  motion,  wt,  the  probability 
density  of  the  state  xt  lies  in  the  family  S  for  0  bounded,  with  g{6)  >  01  for  some 
0  >  0.  We  also  assume  the  conditions  in  Fact  4.1.2  and  in  Theorem  4.1.4  with 
c(x)  replacing  m(x) .  Then,  there  exist  k\  and  k.2  such  that 

E[\\9t  -  0t\\]  <  kih2  +  ,  t  G  [0,  tf]  (4.5) 

where  0t  is  the  estimate  of  6t,  and  N  and  h  are  the  number  of  particles  and  the 
time  step,  respectively. 
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Proof:  Let  do  be  the  initial  condition  for  6.  At  t  —  0,  N  independent  initial 
conditions  are  generated  based  on  the  density  p  (x,  0Q),  and  the  approximation 
method  (4.3)  is  applied.  From  (4.4)  we  know  that: 

1  N  U' 

S||S#,c(x,)-Fi:c(x;)  ||  <  Kh2  +  jwj2- 

i=  1 

On  the  other  hand,  from  (4.1),  we  know  that  6  is  a  solution  to  the  system  of 
equations 

1  N 

Jr  Y  fo(xt)  =  Egt(cj(*t)),  for  j  =  1,  •  •  •  ,p. 
i= 1 

N 

From  Fact  4.1.2.  the  solution  is  exact  if  we  replace  J2  c,-(xj)  by  Egt{Cj(xt)). 

i= i 

Subtracting  the  term  Egt(cj(x))  from  both  sides  of  the  above  equations  and  using 
the  vector  form  for  it,  we  get 
1  N 

Jr  EC(XD  -  ^t(c(xi))  =  EeM^  -  ^(c(x*))- 

i— 1 

On  the  other  hand,  we  know  that  Eg( c(x))  is  a  differentiable  and  one  to  one 
function  of  6  (  see  Fact  4.1.2).  The  derivative  of  this  function,  g(0),  is  positive 
definite  and  by  assumption  g{9 )  >  01 .  Therefore,  3cc  >  0  such  that 


Pt~0t\\  <  «ll^et(c(x0)  -^(c(x<))ll 

i  N 

=  a||^(c(xt))  -  E  £  c(x*)||. 

i= 1 

Taking  the  expectation  on  both  sides  of  the  inequality  we  have 


E\\et-e,\\  <  aB||iEc(x-)-a.(c(x1) 

1=1 


<  a[Kh2  +  ^ 


=  k\h 2  + 


ko 

N1/2 


❖ 


Now  we  are  ready  to  present  the  main  result  of  this  section. 
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Theorem  4.1.6  For  System  (3.3)  assume  that  ft(-)7  Gt(-)t  and  h(-)  are  such  that 
for  the  Brownian  motion  wt;  and  the  Gaussian  noise  vn,  the  conditional  proba¬ 
bility  density  of  the  state  xt,  conditioned  on  the  observations,  lies  in  the  family 
S  for  0  bounded  and  for  t  e  [0,  T].  Also  assume  the  conditions  in  Fact  4.1.2 
and  in  Theorem  4.1.4  with  c(x)  replacing  w(x).  Then,  if  g~l  ( 9t )  Egt  (£t  c  (x))  is 
Lipschitz  with  Lipschitz  constant  L  and  g{9)  >  01 ,  there  exist  l\  and  l2  such  that 

E\\0n  -  9n ||  <  exp {Lit)  yhh2  +  ,  nr  e  [0,  T], 

where  9n  is  the  estimate  of  9n,  and  N  and  h  are  the  number  of  particles  and  the 
time  step,  respectively.  This  inequality  implies  convergence  of  the  estimated  pa¬ 
rameter,  9n,  to  the  true  parameter,  9n,  as  h  — >  0  arid  N  — >  oo. 


Proof:  Let  9t  and  9t  be  the  actual  and  the  estimated  values  of  the  parameter  of 
the  density  at  time  t  =  nr,  respectively.  At  time  t  =  (n  +  l)r  the  error  in  the 
estimate  of  9t>  is  a  combination  of  the  error  in  the  estimate  in  9t  and  the  error 
added  in  the  time  interval  [t,t']. 

If  the  conditional  density  stays  in  the  exponential  family  of  densities,  9t  has  to 
satisfy  the  following  differential  equation: 

9  =  g -1  ( 9 )  ESt  (£tc  (x))  dt,  nr  <t  <  (n  +  1)  r. 

Let  9t /  be  the  estimate  of  9t> ,  if  the  error  due  to  resampling  and  the  approxima¬ 
tion  of  the  stochastic  differential  equation  solution  is  not  taken  into  account  in  the 
interval  [t,  t']  (i.e.  Of  is  computed  from  the  above  ordinary  differential  equation 
starting  at  9t),  then 

II  0t>  -9t>  ||  <  ||  0t>  —  0t>  ||  +  ||  Qt>  -9tf\\. 
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By  the  assumption  of  the  theorem,  g_1  ( 9 )  Egt  (. CtC  (x))  is  Lipschitz  with  Lip- 
schitz  constant  L,  then  by  continuity  of  the  solution  of  the  differential  equation 
with  respect  to  the  initial  condition  [31],  we  know  that 

0,'  -  0,-\  <  ||^-^||e^'-d, 

therefore, 

eU?  -eJ  <  E\\et 

Also  from  the  Lemma  4.1.5,  3ki(t')  and  fc2(0  such  that 

therefore, 

EPt'  -  #,'  II  <  E  I#,  -  #,||  e11''-11  +  h(t)h2  + 

The  observation  noise  vn  and  the  function  h(-)  are  such  that  Bayes’  Rule  does 
not  introduce  any  further  error  in  the  estimate  of  9t> .  More  precisely,  \l/n(x)  is 
assumed  to  be  a  member  of  S.  This  implies  that  after  applying  Bayes’  Rule  to 
p(x.,9t>)  and  p(x.,9t>)  parameters  9t>  and  9t>  are  shifted  with  the  same  vector  and 
therefore,  || 9t+>  —  (9t+'||  =  || 9t>  —  9p\\.  Here  t+'  represents  the  time  right  after  Bayes’ 
correction.  Therefore,  starting  from  the  initial  condition  9q  we  get 

E\\9n  -  9n ||  <  X!  exp(Lfr)  \^hh2  +  ,  nr  E  [0,  T } 

where 

k  =  max hj(nr),  nr  E  [0, T],  i  —  1,2. 

o 
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Here,  we  would  like  to  make  a  few  remarks: 


•  The  result  of  Theorem  4.1.6  can  be  easily  extended  to  convergence  in  the 
mean  square  sense. 

•  The  assumption  that  the  probability  density  stays  in  the  family  of  densities, 
S,  does  not  seem  very  realistic.  But  with  our  approach,  we  should  be  able  to 
get  the  result  in  [11],  In  fact,  in  [11]  the  evolution  of  the  density  is  forced  to 
stay  in  the  family  at  every  single  moment.  In  our  method,  we  only  force  the 
density  to  be  in  the  family  at  the  end  of  each  full  iteration,  i.e.  observation 
epoch.  This  allows  the  estimated  density  to  be  closer  to  the  actual  density. 

•  In  [11]  the  observation  equation  is  considered  to  be  time  invariant.  Here, 
the  time-varying  nature  of  h„  (x)  does  not  complicate  the  algorithm.  It 
surely  affects  the  assumption  that  the  density  stays  in  the  family,  but  as  we 
explained  earlier,  this  assumption  is  not  realistic  to  begin  with,  and  it  will 
be  dropped. 

•  If  -u(-)  is  in  E,  then 


lim  E  II Eeu(x.)  —  Eq*u(x)  ||  =  0. 

N - ►  oo  11  1 

h - >0 

This  is  a  criterion  similar  to  the  one  used  in  [40]. 
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4.2  Projection  Particle  Filtering  for  Exponential 


Families  of  Densities 

In  this  section,  we  drop  the  assumption  that  the  conditional  density  of  the  state 
given  the  observation  (3.6)  lies  in  the  exponential  family  of  densities,  S.  Also,  we 
do  not  require  that  H/n(x)  is  a  member  of  S.  Instead  we  make  other  assumptions. 
First  we  need  the  following  definition: 

Definition  4.2.1  We  say  that  a  function  u(-)  belongs  to  the  class  T}.K,  written 
as  u  G  TkK,  for  fixed,  k  >  0  and  k  >  0,  such  that  for  all  x  e  lZn ,  the  following 
inequality  holds: 


IKX)||  <  fc(l+  l|x|r). 

The  next  two  assumptions  are  to  guarantee  the  existence  of  an  exponential 
density  close  to  the  true  conditional  density. 

A  4.2.2  For  the  density  in  (3.6)  there  exists  an  exponential  family  of  densities  S 
such  that  Vt  G  [0,  T],  \/u  G  T}.K  39)  e  0*  and  e  >  0  such  that 

E\\EPt(u(x))  -£0.(u(x))||  <  e  ,  (4.6) 

where  0*  is  convex  3  and  compact. 

3It  is  easy  to  see  that  the  assumption  of  convexity  is  very  natural.  Assume  61,62  €  0*  then 
f  exp(#l’c(x))4x  <  00  for  i  =  1,2.  Therefore,  using  the  Holder  inequality  we  have 

f  exp((a0^f  +  (1  —  a)6^)c(x))dx  =  f  (exp(0^c(x)))“  (exp(fljc(x)))  (1  —  a) 

<  (/((eX P(Sfc(x)))  )  1  dfj  ^(ex p(0jfc(x)))  ^  dx'j 

=  exp(0^c (x))dxj  ^ J'  exp(0^c(x))dx^ 

<  00 

where  0  <  a  <  1. 
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A  4.2.3  For  6^-  in  (Af.2.2)  and  \l/n(x);  3T*(x)  such  that 


p(x,  g;_)^(x 
fp(x,  9*_)V*(x)dx 

is  in  the  family  S  for  some  9  G  0*  and  we  have: 


•  \/9  G  0*  and  Vu(-)  G  Tkn,  3e  >  0  such  that 

~..Eg'Sn(x)u(x)  _  Ee^*n{x)u{x) 

EgA/n(x)  Ee^*n(x)  ~ 

•  Vu(-)  G  F\zk,  3e  >  0  such  that 

Ee*_m*n(x)u(x)  Ep  _^n(x)u{x) 

Tp  1 1  _ n _  _  n _ 

Eg*  \P£(x)  Ep  ^n(x)  _ 

n  n~ 

From  Assumption  (A4.2.3)  it  is  clear  that  if  \l/*(-)  satisfies  the  requirements  of 
the  assumption  then  c'F*  (•)  satisfies  the  same  requirements,  where  c  is  a  positive 
constant.  Therefore,  without  loss  of  generality  we  assume  that  T*  (•)  =  exp(arc(-)) 
for  some  a  G  1ZP .  Using  Assumption  (A4.2.2),  we  can  state  the  following  fact. 


Fact  4.2.4  MQ\ ,  62  G  0*  andMu  G  Tkn,  3A1;  K2  positive  such  that 

II E01u(x)  -  Eg2u(x) ||  <  K1\\91  -  6>2|| 


(4.7) 


||«i-#2ll  <  V2||B»1c(x)-£;fcc(x)||  .  (4.8) 

Proof:  To  prove  (4.7),  define  fu(9)  =  Egu{x)  for  «(•)  G  EkK-  Then 

jQ-  fuiO)  =  EgCi(x)u(x )  -  EgCi(x)Egu(x). 

Since  ||u(x)||  <  k(  1  +  ||x||K)  and  6  G  0*,  where  0*  is  compact,  then  there  exists  a 
constant  A  such  that 

||%^ ||  <  A  Vu(-)  G  TkK  and  V0  G  0*. 
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Since  0*  is  convex  and  compact,  it  is  clear  that  3 Ah  independent  of  u(-)  such  that 
/u(x)  is  Lipschitz  over  0*  with  the  Lipschitz  constant  K\  [31]. 

Inequality  (4.8)  follows  from  the  fact  that  0*  is  compact  and  the  Fisher  infor¬ 
mation  matrix  g{6)  >  61  over  0*. 


Denote  the  interior  of  the  set  0*  by  0-^.  For  0-^  we  can  state  the  following  fact. 
Fact  4.2.5  The  set 

A  —  jo;  :  J  exp(o;Tc(x))  exp(dTc(x))dx  <  oo,  \/9  G  ®\nt  and  a  G  A.pj 
is  closed. 

Proof:  Assume  A  is  not  closed.  Therefore,  there  exists  a  converging  sequence 
{cq}  C  A  with  converging  point  a  ^  A,  then  36  G  0-^  such  that 

J exp(oTc(x))  exp(dTc(x))dx  >  M,  VM  G  1Z. 

Since  0-^  is  an  open  set,  3e  >  0  such  that  J\ft{6)  G  0-^.  Also,  since  {cq}  is  a 
converging  sequence,  3k  >  0  such  that  ak  G  Afe(a).  This  implies  that  6\  G  0-^ 
where  6\  =  6  +  a  —  ak-  Therefore, 

J  exp(a£c(x))  exp(6^c(x))dx  <  oo. 

On  the  other  hand,  we  know  that 

exp(o;^c(x))  exp(dfc(x))  =  exp(aTc(x))  exp($Tc(x))  . 

This  is  a  contradiction,  therefore,  A  is  closed. 


o 
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The  following  lemma  is  one  of  the  building  blocks  of  the  results  of  this  section. 


Lemma  4.2.6  For  6**_  and  \l/*(x)  defined  in  (Af.2.3),  and\/u(-)  G  3  positive 
numbers  ki,  k2l  k3,  independent  of  and  \l/*(x);  such  that  \/91,92  G  0*  the 
following  are  true. 


(a) 

h  <  WEeKWW  <  k2 

W9  G  0* 

(b) 

||EeT;(x)u(x)||  <  k3 

\/9  G  0*. 

(c) 

\\E01  ^(x)«(x)-4$ 

n(xMx)|| 

Proof:  Let  A4  be  a  set  defined  as  follows 


AA  =  {m  :  m  =  9i  —  d2,  V6*i,  62  G  ©*}• 

We  claim  that  A4  is  compact.  To  prove  this  claim,  assume  {irq}  to  be  a  sequence 

in  A4,  i.e  m,  G  JA.  Also  we  assume  that  lim  m,  =  m.  We  know  that  there  exist 

% - >00 

sequences  and  {6*2,1}  such  that  m;  =  9\^  —  02)i  and  6*i,j,  02)i  G  0*.  Since  0*  is 

compact  there  exist  converging  subsequences  and  {6*2,*}  in  0*.  This  implies 

that  m  =  61  —  02,  where  9 1  and  02  are  the  limits  of  the  subsequences  {9i,i}  and 
{02, i}-  Since  9 1  and  02  G  0*,  then  m  G  Af,  therefore  M.  is  closed.  Since  0*  is 
bounded,  Af  is  bounded  and  therefore,  it  is  compact. 

We  define  set  A\  as  follows: 

A\  —  jo  :  J  exp(aTc(x))  exp(0Tc(x))dx  <  00,  V6*  G  0*  and  a  G  77p  j . 

It  is  clear  that  A\  C  A.  As  we  mentioned  before,  without  loss  of  generality  we 
can  assume  \I/*(x)  =  exp(oTc(x))  and  from  Assumption  (A4.2.3)  it  is  clear  that 
a  G  Af|  AL  Since  A{~\AA  and  0*  are  compact  we  have 

min  min  ||E0T;(x)||  <  p0T;(x)||  <  max  max  ||A0T;(x)||. 
eee*  ae^t  f|  M  0e©*  f|  M 
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In  other  words  (a)  is  true  with  k\  =  min  min  \\Eg^/*  (x)  II  and 

0£0*  a&A  P|  M 

k2  =  max  max  \\Eg^>*  (x) II .  Similarly,  since  u(- )  G  (b)  is  true. 

eee*  a£Af]M 

Using  the  above  argument  and  the  argument  in  Fact  4.2.4,  we  can  show  that 
||  (x)u(x)||  is  uniformly  bounded  and  since  0*  is  convex  and  compact,  then 

(c)  is  true  [31]. 


o 

In  the  following  we  go  through  the  proof  of  the  theorem  that  we  state  later 
precisely.  Assume  9n  is  calculated  according  to  Algorithm  4.1.1  and  assume  p(x,  9n ) 
is  such  that  Vw  G 


B||^;«(x)-£».«(x)||  <  6,  (4.9) 

where  9*  (see  Assumption  (A4.2.2))  satisfies 

E\\EPnu(x)  -  £fl*u(x)||  <  e.  (4.10) 

Using  the  density  p(x,  9n),  new  particles  xr( ,  •  •  • ,  are  generated.  The  approxi¬ 
mate  solution  for  the  stochastic  differential  equation  at  time  (n  +  l)r  maps  these 
particles  to  x^+1,  •  •  • ,  x^+1.  From  these  new  particles  9n+ 1  is  calculated.  From 
(4.9)  and  (4.10)  we  have 

E\\EPnu(x)  *-  Efinu(x)\\  <  5  +  e.  (4.11) 

We  define  the  function  r(x)  as  follows: 

r(x)  =  U"c(x„iX((n  +  1))) 

where  x„,x((n  +  l)r)  is  the  approximate  solution  of  stochastic  differential  equation 
(4.2)  at  time  (n  +  1  )r  with  the  given  initial  condition  x  at  time  nr  using  the 
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method  in  (4.3).  Since  according  to  our  assumption  c  e  Thm  then  by  using  lemma 
9.1  in  [39],  we  have 

II r(x) ||  <  K3(l  +  ||x|H, 

where  K:i  and  fi  only  depend  on  the  function  c(-)  and  the  dimension  of  x.  We 
assume  that  r  e  If  the  argument  of  r(-)  is  a  random  variable,  then  using 

(4.11)  we  have 

E\\EPnr(x)  -^?nr(x)ll  <  S  +  e-  (4-12) 

More  explicitly, 

E\\EPnE"[c(xn,x((n  +  l)r))]  -  E^nE,,[c(xn>x((n  +  l)r)]||  <  5  +  e.  (4-13) 
From  Theorem  4.1.4  we  have 


E\\EPnc(xnA(n  +  l)r)) 


EPnE"c(5tn!X((n+l)r))\\  <  K4h 2, 


(4.14) 


for  some  K4  >  0. 

Using  the  Monte  Carlo  method  to  calculate  the  UPnc(xnx((n  +  l)r))  brings 
another  error  term  that  is  due  to  the  finite  number  of  particles  as  the  initial 
conditions  for  method  (4.3).  The  expectation  of  this  error  is  bounded,  i.e.  3K§  >  0 
s.t. 


~  1  N  K 
E\\E$  £"c(x„iX((n  +  1)t))  -  <—  ^c(x  ^((n  +  l)r))||  <  — J,  (4.15) 

i=1  1  TV  2 

where  x*  are  distributed  according  top(x,  0n).  Combining  (4.13),  (4.14),  and  (4.15) 
we  get 


U||Up„c(x„,x((n  +  l)r))  -  E  c(xn^(0 n  +  l)r))||  < 

(4. Id) 

5  +  e  +  K4h2  +  Ej. 


49 


Based  on  (A4.2.2),  we  know  that  3d*n+1)_  such  that 

£|lBwc(x)^BWc(x)l|S£'  (4'17) 

We  know  that,  if  x  (initial  condition  at  time  nr)  is  distributed  according  to  pn(x), 
then  Ep  _c(x)  =  £,p„c(xnjX((n  +  l)r)),  therefore,  from  (4.16)  and  (4.17)  we  get 

(n+1)  ’ 


1  iV  jy 

E\\Ee*  ^  c(x)  -  T7l>(xn+((™  +  1)t))II  <  S  +  2e  +  K4h2  +  — 


(n+l) 


N 


i=  1  N-2 

Then  9(n+u-  given  by  Algorithm  4.1.1  satishes  the  following  inequality 

K5 


E\\E, 


(n+l)’ 


c(x)  —  E-  c(x)  ||  <  5  +  2e  +  K4h  + 

V+1)-  _/V  2 


(4.18) 


(4.19) 


From  (A4.2.3)  we  know  that  36*  e  0*  such  that 


Eg*  ^;+1(x)w(x)  Ev  tf„+1  (x)it(x) 

(n+l)  (n+l) 

=  E 

Eeu( x)  -  £P(n+1)u(x) 

Eg*  V+l(x)  EP,  Vl(x) 

(n+l)-  (n+!) 

<-  e. 


Since  9  satishes  the  inequality  in  (A4.2.2),  we  can  choose  9*n+r)  to  be  9 ,  i.e. 


A* 

V+i) 


9. 


On  the  other  hand  we  have 


Eg *  w(x) 
V+1)  V  > 


E 


(n+l) 


MX 


Eg* 

(”+l)~ 

Eg* 

(n+l) 


;+i  (*m*) 

v+l(x) 


E~  <I'„+l(x)«(x) 

%+!)- _ 

B-  ^n+ltx) 

(»+«- 


< 


Eg*  ^ 

(”+l)~ 

Eg* 

(n+l) 


;+i(x)«(x) 

*;+i(*) 


Eg* 

(ra+l)~ 

EE* 

V+0- 


*+l(xMx) 

V  +  dX) 


+ 


Eg* 

(ra+l)~ 

(»+!)“ 


;+i(x)«(x) 

v+v) 


*;+i(x)u(x) 
V+i)- _ 


(n+l) 


fn+l 


(x) 


+ 


V  * 
V+i)- 

V 

(n+l)“ 


;+i(x)«(x) 

V+iW 


Eh  ^„+i(x)u(x) 

("+!)- _ 

Eh  ^n+l(x)  ’ 

V+i)- 
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therefore, 


Eg*  u( 

%+ 1)  v 


E 


(n+l) 


MX 


< 


\\Ee*  +i+i(x)u(x) 

("+i)~ _ 


ii^®*  *;+i(*)ini^  'fhiW 

(n+l)“  (n+l)" 


Bw^tx)-^+ir*;+1(x) 


+ 


ii^  ®;+i(*) 

%+i)- 


Eg*  _+*+1(x)m(x)  -  Eh  (x)n(x) 


(n+l) 


Jo. 

(n+l) 


+ 


-E?  +L|-i(xMx)  *„+i(x)it(x) 

(n+l) _  _  (n+l) 


^  ^;+l(x)  E~  ^n+l(x) 

(n+l)-  (n+1)- 


Using  Lemma  4.2.6  and  (A4.2.3)  we  get 

£|l%+1)«(x)  -  ^„+1+(x)H  ^  k*kA+2klkA E\\0ln+i)-  -  %+i)-||  +  e- 


?(n+l) 


kl 


Therefore,  from  (4.19)  and  Fact  4.2.4  we  get 

E\\0tn+1)-  -  hn+ D-  II  <  (S  +  2e  +  K^2  +  3 

V  iV  2 

This  implies  that,  3(+,  i2, 1 3,  (4  >  0  such  that 


K5 


E\\Eg*  m(x)  -  F+  «(x)||  <  ti<J  +  (2e  +  fc3/i2  +  <4^  2 . 

fn+l;  L,(n+ 1) 


The  next  theorem  summarizes  our  result  in  this  section. 


Theorem  4.2.7  For  the  system  (3.3)  assume  (A3. 1.1),  (A3. 1.2),  (A4-2.2),  arid, 
(A4-2.3).  We  also  assume  the  conditions  in  Fact  4.1.2  and  in  Theorem  4.1.4 
with  c(x)  replacing  u(x),  and  we  assume  r  €  Ff;K ■  Then  in  Algorithm  4.1.1  with 
approximation  (4-3),  ifWu(-)  G  EkK 

E\\Egnu(x)  -  Ee*u(x)\\  <  5 
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then 


E\\E0*n+iu(x.)  -  E?(n+iu(x)\\  <  nS  +  i2e  +  t3h2  +  l4N  2, 
for  some  t\,  l2,  l3,  t4  >  0. 

In  Theorem  4.2.7  only  one  step  of  Algorithm  4.1.1  is  considered;  it  is  straight¬ 
forward  to  then  use  Theorem  4.2.7  repeatedly  for  the  time  interval  [0,  T],  where 
T  =  Mr.  In  that  case,  || Egufx.)  —  Eq*u{x)  ||  <  So,  then  3a  1,  a2,  a3,  and  0:4  positive 
such  that 

E\\Eg*u(-x.((n)r))  -  E^u(x((n)r))||  <  aiS0  +  a2e  +  a3h 2  +  a4lV^1/2, 
for  0  <  n  <  M . 
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Chapter  5 


Application  of  Projection  Particle 
Filtering  in  Navigation 


In  this  chapter  we  use  the  approximation  methods  for  nonlinear  filters  introduced 
in  the  previous  chapters  in  position  estimation  for  systems  with  nonlinear  dynamics 
and  observation.  We  are  particularly  interested  in  the  situations  where  methods 
based  on  linearization  such  as  EKF  fail  to  provide  reasonable  estimates. 

In  the  first  part  of  this  chapter  we  address  the  problem  of  positioning  in  the 
presence  of  integer  uncertainty.  Such  uncertainties  arise  in  navigation  problems 
where  carrier  phase  differential  GPS  is  part  of  the  observations.  In  these  cases 
resolving  the  integer  ambiguity  is  essential  for  the  navigation  system. 

In  the  second  part  of  this  chapter  we  apply  projection  particle  filtering  to  an 
Integrated  INS/GPS.  We  show  that  when  the  number  of  visible  satellites  is  below  a 
critical  number  nonlinear  filtering  can  provide  an  accurate  estimate  of  the  position 
while  EKF  fails  to  converge. 
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5.1  Particle  Filtering  for  Nonlinear  Systems 
with  Constant  Integer  Uncertainty 

Consider  the  following  nonlinear  dynamics  and  observation 

dxi  =  b  (xt  )dt  +  Gt(xt)  dwt 
y  nr  =  h  „(x(nr))  +  Jn  z  +  vn 

where  the  assumptions  and  the  dimensions  for  xt,  ynT  ,  wt,  and  vn  are  the  same 
as  in  the  previous  sections.  We  assume  that  z  is  a  random  integer  vector,  i.e. 
z  G  Zm  and  Jn  has  the  proper  dimension.  Vector  z  is  assumed  to  be  constant  in 
time.  This  problem  can  be  set  up  in  discrete  time  as  well.  In  this  case, the  system 
dynamics  and  the  observation  can  be  written  as  follows: 

x„+i  =  fn(xn)  +  Gn(xn)wn 
yn  =  hn(xn)  +  Jnz  + vn 

In  both  setups  we  assume  that  the  integer  uncertainty  affects  only  some  compo¬ 
nents  of  the  observation,  and  other  components  are  unaffected  by  z.  The  affected 
components  have  associated  noise  components  in  vn  that  have  considerably  lower 
energy.  In  other  words,  the  uncertain  components  of  ynT  (or  equivalently  yn) 
would  be  considerably  more  accurate  than  the  other  components,  if  the  integer 
ambiguities  were  known.  This  suggests  that  an  accurate  estimation  of  z  can  in¬ 
crease  the  accuracy  of  the  estimate  of  the  state  of  the  system  significantly.  With 
this  explanation,  our  treatment  of  z  is  clear.  From  the  state  dynamics  and  the 
observation  equation  we  first  estimate  z  and  then,  with  fixed  z,  we  use  regular 
nonlinear  filtering  methods  to  estimate  the  state  of  the  system  xt. 

We  augment  the  state  xt  with  the  integer  ambiguity  z.  Having  done  that,  the 
state  dynamics  and  the  observation  have  the  following  form: 
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d 

Xt 

f i(x*) 

dt  + 

Gtfa-t) 

z  t 

0 

0 

(5.1) 


Ynr  =  hn(x(nr)  +  Jnz(nr))  +  vn. 

We  assume  that  the  initial  distribution  of  (xq,z^)t  is  known.  Now  the  state 
dynamics  and  the  observation  have  the  same  form  as  was  studied  in  Section  (3.3). 
Therefore,  we  can  apply  particle  filtering  to  find  the  conditional  probability  distri¬ 
bution  of  the  augmented  state.  This  setup  is  a  special  case  of  the  setup  in  Section 
(3.3).  In  (5.1)  there  is  no  state  transition  for  zt,  therefore,  using  particle  filtering 
in  its  original  form  may  not  be  the  best  option.  Recall  that  in  particle  filtering 
we  start  with  N  i.i.d.  particles  distributed  according  to  the  initial  distribution. 
In  the  resampling  part  the  low  probability  particles  die  and  the  high  probability 
particles  produce  many  particles  identical  to  themselves.  Since  zt  does  not  change, 
the  part  of  the  particles  associated  to  zt  tends  to  cover  smaller  and  smaller  por¬ 
tions  of  the  state  space.  In  fact,  the  state  space  of  the  integer  vectors  is  defined 
by  the  particles  at  the  initial  time.  This  problem  can  be  overcome  by  modifying 
the  algorithm  mentioned  in  Section  (3.3).  In  the  new  algorithm,  Step  5  is  changed 
in  such  a  way  that  the  particles  are  the  addition  of  the  original  particles  found 
by  Algorithm  3.3.1,  with  a  random  vector.  The  modification  is  very  important 
for  the  integer  values,  since  the  integers  do  not  have  a  dynamics  that  is  driven 
by  a  random  input.  In  [34],  a  similar  modification  has  been  used  for  the  regular 
nonlinear  filtering  setup.  It  seems  that  the  convergence  results  given  in  [34]  can 
be  applied  to  the  current  case  as  well. 

Based  on  the  modified  algorithm,  we  simulated  a  nonlinear  filtering  problem 
similar  to  the  problem  involved  in  the  GPS  system. 

In  a  two  dimensional  space,  three  transmitters  (imagine  three  pseudo  satcl- 
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lites)  are  mounted  on  three  known  points  (2000,  100000),  (0,  100000),  and  (-2000, 
100000).  The  moving  object  can  measure  its  distance  from  these  transmitters. 
For  each  pseudo  satellite,  two  types  of  measurement  are  possible:  One  with  high 
measurement  noise  and  the  other  with  low  measurement  noise.  For  the  low  mea¬ 
surement  noise,  though,  there  is  an  integer  ambiguity.  The  dynamics  of  the  moving 
object  for  this  example  is  considered  to  be  in  discrete  time  and  linear  time  invari¬ 


ant.  The  dynamics  and  observation  equation  is  given  as  follows: 


1  At  0 
0  1  0 
0  0  1 
0  0  0 


yan  =  l|x-Si|| +u“  ,*  =  1,2,3 

ybn  =  llx-s*ll  +ni  +  vbn  ,*  =  1,2,3, 

where  x  =  (xi,x2)T,  s,  is  the  position  of  pseudo  satellite  i  in  two  dimensional 
space,  A t  =  0.1  unit  of  time,  nt  is  the  integer  ambiguity  of  the  pseudo  satellite 
i,  and  w  =  (wi,W2,W3,W4:)T  and  v  =  (v±,  v\,  v%,  v$,  v$,  are  zero  mean  white 
Gaussian  noise  process  with  covariance  matrices  £w  =  diag  (1,  0.5, 1,  0.5)  and 
£v  =  diag  ( 5,  0.2,  5,  0.2,  5,  0.2),  respectively.  In  the  simulation,  it  is  assumed  that 
the  initial  condition  for  the  position  is  distributed  in  a  square  of  size  200  x  200 
units  squared,  symmetric  with  respect  to  the  origin. 

In  brief,  the  simulation  can  be  separated  into  two  parts,  initialization  and 
the  full  non-linear  filtering.  In  the  initialization  part,  we  start  with  the  initial 
probability  distribution  for  (xi,x2)  and  from  a  series  of  observations,  we  find  an 
estimate  for  the  probability  distribution  of  (vi,v2).  In  this  part,  we  do  not  use  the 
dynamics  of  the  moving  object.  Using  our  estimate  for  the  probability  distribution 
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of  (xi,  Vi,  X2,  V2)  we  find  the  distribution  for  the  integer  ambiguity.  After  this, 
the  initialization  is  over,  and  the  full  non-linear  filter  is  used.  There  are  some 
minor  numerical  considerations  that  we  would  like  to  point  out.  In  the  Bayes 
step  of  the  algorithm,  the  numbers  are  usually  very  small,  and  without  proper 
scaling  the  original  algorithm  would  not  work.  In  the  resampling  part,  one  can 
use  the  law  of  large  numbers  and  regenerate  the  particles  based  on  their  weight 
without  generating  random  numbers  that  are  time  consuming.  The  result  of  the 
simulations  are  shown  in  Figures  5.1,  5.2,  5.3,  5.4,  5.5,  and  5.6.  To  display  the 
estimated  integers,  we  simply  used  the  mean  value,  which  is  not  necessarily  the  best 
choice.  Of  course,  since  we  have  the  distribution,  we  can  use  the  MAP  estimate 
of  the  integers.  In  this  simulation  we  forced  one  of  the  integers  to  have  a  jump. 
Although  our  algorithm  is  not  designed  for  these  kinds  of  changes,  we  see  that 
it  can  estimate  the  new  integer  values.  In  future,  we  use  special  treatment  for 
the  times  when  these  kinds  of  jumps  happen.  As  we  can  see,  the  estimates  for 
the  integers  are  reasonably  good.  The  reliability  of  the  estimate  for  the  integers 
depends  on  the  energy  of  the  noise. 
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The  Estimated  and  Actual  Integer  Ambiguity 


Figure  5.1:  Estimated  integer  ambiguity  versus  the  actual  integer  ambiguity  of 
pseudo  satellite  (1).  At  time  100  there  is  a  cycle  slip  of  strength  -20  for  the 
measured  phase  of  the  carrier  from  pseudo  satellite  (1). 


Time 


Figure  5.2:  Estimated  integer  ambiguity  versus  the  actual  integer  ambiguity  of 
pseudo  satellite  (2).  At  time  100  there  is  a  cycle  slip  of  strength  -20  for  the 
measured  phase  of  the  carrier  from  pseudo  satellite  (1). 
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Figure  5.3:  Estimated  integer  ambiguity  versus  the  actual  integer  ambiguity  of 
pseudo  satellite  (3).  At  time  100  there  is  a  cycle  slip  of  strength  -20  for  the 
measured  phase  of  the  carrier  from  pseudo  satellite  (1). 


Th  estimated  x  component  versus  the  actual  x  component 


Figure  5.4:  Estimated  X\  component  versus  the  actual  x\  component  of  the 
position  of  the  car.  At  time  100  there  is  a  cycle  slip  of  strength  -20  for  the  measured 
phase  of  the  carrier  from  pseudo  satellite  (1). 
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The  estimated  y  component  versus  the  actual  y  component 


Figure  5.5:  Estimated  x 2  component  versus  the  actual  X2  component  of  the 
position  of  the  car.  At  time  100  there  is  a  cycle  slip  of  strength  -20  for  the  measured 
phase  of  the  carrier  from  pseudo  satellite  (1). 


The  actual  trajectory  versus  the  estimated  trajectory 


Figure  5.6:  Estimated  trajectory  versus  the  actual  trajectory  of  the  car.  At  time 
100  there  is  a  cycle  slip  of  strength  -20  for  the  measured  phase  of  the  carrier  from 
pseudo  satellite  (1). 
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5.2  Applications  of  Projection  Particle  Filtering 
for  an  Integrated  INS /GPS 

For  the  rest  of  this  chapter  we  assume  that  the  integer  ambiguity  resolution  prob¬ 
lem  is  resolved  (see  Chapter  7).  Therefore,  we  consider  the  observation  equation 
provided  by  the  ith  GPS  satellite  to  have  the  following  form: 

Vi  =  p\rx,  ry,  rz)  -  pl(bx,  by ,  bz)  +  c5  +  vl  ,  (5.2) 

where  [ bx ,  by ,  bz]T  is  the  known  base  coordinate,  6  is  the  combination  of  the  receiver 
clock  bias  in  the  base  and  the  rover,  and  vl  is  the  measurement  noise  for  the  ith 
satellite  signal. 

Here  we  would  like  to  mention  that  the  nonlinearity  in  measurement  is  not 
only  due  to  the  function  p.  As  we  explain  later  the  integrated  INS/GPS  requires 
coordinate  transformations  between  the  INS  parameters  and  the  GPS  parameters, 
which  contributes  to  the  nonlinearity  of  the  measurement. 

5.2.1  Coordinate  Systems 

Parameters  of  an  integrated  INS/GPS  are  expressed  in  different  coordinate  sys¬ 
tems.  In  this  subsection  we  intend  to  introduce  these  different  coordinate  systems 
and  the  transformation  from  one  to  another  [21], 

ECEF  frame 

The  GPS  measurements  are  given  in  an  Earth  Centered  Earth  Fixed  (ECEF) 
frame.  Two  different  coordinate  systems  are  common  for  describing  the  location 
of  a  point  in  the  ECEF  frame. 
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parameter 

value 

Description 

a 

6378137.0  m 

semi  major  axis 

b 

6356752.3142  m 

semi  minor  axis 

^ ie 

7.292115  x  10“5 

angular  velocity  of  the  Earth 

f 

c  a—b 

J  a 

flatness  of  the  ellipsoid 

e 

V?!1  -  /) 

eccentricity  of  the  ellipsoid 

Table  5.1:  Definition  of  the  parameters  for  WGS84  reference  frame 

The  usual  rectangular  coordinate  system  [px,Py,Pz]T  for  the  point  p,  herein 
referred  to  as  the  ECEF  coordinate  system,  has  its  x  axis  extended  through  the 
intersection  of  the  prime  meridian  (0°  longitude)  and  the  equator  (0°  latitude). 
The  z  axis  extends  through  the  true  north  pole  (i.e.  parallel  to  the  Earth’s  spin 
axis).  The  y  axis  completes  the  right-handed  coordinate  system. 

The  geodetic  coordinate  system  is  defined  according  to  the  familiar  latitude, 
longitude,  and  hight  coordinate  system  and  is  shown  by  \px,p4>,ph]T  ■  For  this 
system  of  coordinates,  the  Earth’s  geoicl  is  approximated  by  an  ellipsoid.  The 
defining  parameters  for  the  geoicl  according  to  the  WGS84  reference  frame  are 
given  in  Table  5.1. 

The  transformation  from  the  ECEF  geodetic  to  the  ECEF  rectangular  coordi¬ 
nate  systems  is  given  as  follows 

px  =  ( N  +  ph)cos  (jpx )  cos  (p* ) 

Py  =  (N  +  Ph)cos(Px)sin(P*)  (5‘3) 

Pz  =  (N(l  -e2)+ph)sin(px), 

where  N  =  ,  “  = .  The  inverse  transformation  can  be  derived  from  (5.3). 

sJl-e2sin2(px)  K  ' 
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Local  Geographical  frame 


It  is  convenient  to  express  the  navigation-frame  velocity  in  the  local  coordinate 
system.  This  coordinate  system  is  rectangular,  and  it  has  the  x  axis,  y  axis,  and  the 
z  axis  extended  through  the  north,  the  east,  and  the  down  direction,  respectively. 
With  this  definition  for  the  local  geographic  coordinate  system,  the  navigation- 
frame  velocity,  [wjv,  ve,  vd]t,  is  related  to  the  geodetic  rate  vector  according  to 


(  VN  \ 


( 


R\  +  ph 


o 


vE 


0  (i^+pJcos(pJ  o 


P, 


V  VD  )  V  0 


-1  A  A 


(5.4) 


where  R\  = - e  ^ — 3-,  and  Rs 

(l—e2  sin2(pA))5  * 


_ a _ 

(l-e2  sin 2(px))Z 


Platform  and  Body  frames 

The  measurements  by  accelerometers  and  gyros  are  expressed  in  the  platform 
frame.  For  simplicity  we  assume  that  the  axis  of  the  gyros  and  the  axis  of  the 
accelerometers  are  aligned  with  the  axis  of  the  platform  frame.  Also,  we  assume 
that  the  body  frame  and  the  platform  frame  are  aligned,  and  the  center  of  the 
coordinate  system  is  the  same  for  both  frames.  The  transformation  from  body 
frame  to  local  geographical  frame  is  calculated  at  every  moment,  and  it  depends 
on  the  angular  rate  change  measured  by  the  gyros,  the  rotation  of  the  Earth,  and 
the  rotation  of  the  local  frame  with  respect  to  an  inertial  frame,  all  expressed  in 
the  body  frame.  The  transform  matrix  from  the  platform  frame  to  the  local  frame 
is  expressed  as  follows 

~^Rb2g  =  Rb2gQgbi  (5-5) 
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where 


nb  — 

11 9b  ~ 


^  0  —  r  q  ^ 

r  0  —  p 

—qp  0 


(5.6) 


and  ubgb  =  [p,  g,r]T  is  the  inertial  angular  rate  expressed  in  the  body  frame,  u ;gb 


can  be  expressed  as  follows 
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(5.7) 


J 


where  [p,  q,  r]T  is  the  measured  angular  rate,  and  [bp,bq,br]T  is  the  bias  in  the 
angular  rate  measurement. 

If  we  assume  that  in  the  time  interval  [t,t  +  St] ,  Qbb  is  a  constant  matrix  then 
we  have 

Rg2b(t  +  St)  =  exp(-f tbgb{t)5t)Rg2b{t). 

Since  ttbgb  is  a  skew  symmetric  matrix,  then  exp (—Qbgb(t)8t)  has  a  simple  form: 

.  .  sin(||c<A(f)<5f||)  ,  1  —  cos(||c<A(f)<5f||)  ,  „ 

exp(-f!^t)  =  (/  +  +  1,^  "Hq. 

The  transformation  from  the  body  frame  to  the  local  frame,  Rb2g,  is  simply  the 
transpose  of  Rg2b,  i.e.  Rb2g  =  Rj2b. 


5.2.2  GPS  Clock  Drift  and  INS  Dynamics 

The  GPS  clock  drift  and  the  INS  equations  are  the  sources  that  contribute  to  the 
dynamic  equation  for  the  integrated  INS/GPS. 
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The  INS  dynamic  equation  can  be  expressed  as  follows. 
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where  g  =  9.780327m/s2  is  the  gravitational  acceleration,  [ au,av,aw]T  is  the 
accelerometer  measurement  expressed  in  the  body  frame,  [bu,bv,bw]T  is  the  ac¬ 
celerometer  measurement  bias  again  expressed  in  the  body  frame,  and  wv  is  a 
vector  valued  Brownian  motion  process  with  zero  mean  and  known  covariance 
matrix.  The  measurement  bias  is  assumed  to  have  the  following  dynamics 


‘  K  ' 


y  bw  j 


<h  ^ 


by 

y  '  bl'  J 


dt  +  dwf, 


(5.9) 


where  is  a  vector  valued  Brownian  motion  with  zero  mean  and  known  covariance 
matrix,  and  05  is  a  small  positive  constant. 

The  receiver  clock  drift,  dt,  is  represented  by  the  integration  of  an  exponentially 
correlated  random  process  gt  [16] 


dgt  =  —aegtdt  +  dwf 
ddt  ~  gtdt , 


(5.10) 
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Estimated  x  component  versus  actual  x  component 


Figure  5.7:  Comparison  of  the  estimated  and  actual  x  component  for  three  differ¬ 
ent  methods,  EKF,  particle  filtering,  and  projection  particle  filtering.  For  t  <  100, 
the  number  of  satellites  is  6,  for  100  <  t  <  400,  the  number  of  satellites  is  3,  and 
for  t  >  400,  the  number  of  satellites  is  4. 


with  ae  =  1/500  and  wf  is  a  process  of  Brownian  motion  with  zero  mean  and 
variance  cr2  =  10-24.  This  dynamic  model  is  typical  for  a  quartz  TCXO  with 
frequency  drift  rate  of  lCr9s/s  [16]. 

5.2.3  Simulation  and  Results 

In  this  section  we  present  the  simulation  results  for  an  integrated  INS/GPS.  Here 
we  apply  three  different  filtering  methods,  EKF,  particle  filtering,  and  projection 
particle  filtering  for  a  specified  exponential  density.  We  assumed  that  Rg2b  is 
perfectly  known,  i.e.  the  estimation  problem  regarding  the  gyro  measurements  is 
solved.  Therefore,  the  dimension  of  the  dynamical  system  in  this  simulation  is 
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Estimated  y  component  versus  actual  y  component 


Figure  5.8:  Comparison  of  the  estimated  and  actual  y  component  for  three  differ¬ 
ent  methods,  EKF,  particle  filtering,  and  projection  particle  filtering.  For  t  <  100, 
the  number  of  satellites  is  6,  for  100  <  t  <  400,  the  number  of  satellites  is  3,  and 
for  t  >  400,  the  number  of  satellites  is  4. 


eleven.  The  state  of  the  dynamical  system,  x,  is  given  as  follows 
x  =  \Px’P4,’Ph’vN>  Fe,  vd,  bu,  bv,  bw:  g,  5}T. 

The  differential  equation  describing  the  dynamics  of  the  system  is  the  combination 
of  the  differential  equation  in  (5.8),  (5.9),  and  (5.10).  Here,  we  assume  that  a &  = 
0.001,  and  that  the  covariance  matrices  for  the  Brownian  motions  in  the  INS 
dynamic  equations,  £&  and  E„,  are  diagonal.  To  be  more  specific,  £&  =  10~6/  and 
E„  =  ICG4/,  where  /  is  the  identity  matrix  of  the  right  size.  The  time  step  we 
chose  for  the  approximation  of  the  stochastic  differential  equation  is  h  =  50  ms 
and  the  Gaussian  random  vector  generated  in  each  step  has  the  covariance  matrix 
E/,  =  /iE,  where  E  is  the  covariance  matrix  of  the  combination  of  all  Brownian 
motions  in  the  dynamics.  The  observation  equation  is  given  in  (5.2),  where  yt  is 
one  component  of  the  observation  vector.  The  dimension  of  the  observation  vector 
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Estimated  z  component  versus  actual  z  component 


Figure  5.9:  Comparison  of  the  estimated  and  actual  z  component  for  three  differ¬ 
ent  methods,  EKF,  particle  filtering,  and  projection  particle  filtering.  For  t  <  100, 
the  number  of  satellites  is  6,  for  100  <  t  <  400,  the  number  of  satellites  is  3,  and 
for  t  >  400,  the  number  of  satellites  is  4. 


is  the  same  as  the  number  of  available  satellites.  In  (5.2)  the  observation  is  given  as 
a  function  of  the  position  in  the  ECEF  rectangular  coordinate  system.  Therefore, 
to  be  able  to  write  down  the  observation  equation  as  a  function  of  the  state  of  the 
system,  one  needs  to  use  the  transform  in  (5.3). 

For  this  simulation  we  simply  chose  an  11  dimensional  Gaussian  density  for 
the  projection  particle  filtering.  This  choice  of  density  makes  the  random  vector 
generation  easy  and  computationally  affordable.  To  be  able  to  use  the  projection 
particle  filtering,  we  used  maximum  likelihood  estimation  of  the  parameters  of  the 
Guassian  density  before  and  after  Bayes’  correction. 

In  this  simulation,  we  used  two  NovAtel  RT-21  GPS  receivers  to  collect  the  nav¬ 
igation  data  on  April  2,  2000.  From  the  collected  data,  we  extracted  the  position 


1RT-2  is  the  trademark  of  NovAtel  Incorporated. 


Time  (s) 


Figure  5.10:  The  estimation  error  for  the  platform  position  for  three  different 
methods,  EKF,  particle  filtering,  and  projection  particle  filtering.  For  t  <  100,  the 
number  of  satellites  is  6,  for  100  <  t  <  400,  the  number  of  satellites  is  3,  and  for 
t  >  400,  the  number  of  satellites  is  4. 


Estimation  error  for  platform  position 


Time  (s) 


Figure  5.11:  Detail  of  Figure  5.10,  where  the  difference  between  the  projection 
particle  filtering  method  and  the  particle  filtering  method  is  clear. 
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information  of  the  satellites,  the  pseudo  range,  and  the  carrier  phase  measurement 
noise  powers  for  the  LI  frequency.  Using  the  collected  information  we  generated 
the  pseudo  range  and  the  carrier  phase  data  for  one  static  and  one  moving  receiver 
(base  and  rover,  respectively).  Here  we  assume  for  the  carrier  phase  measurement 
the  integer  ambiguity  problem  is  already  solved.  The  movement  of  the  INS/GPS 
platform  was  simulation  based  and  the  measurement  data  measured  by  the  ac¬ 
celerometers,  the  gyros,  the  GPS  pseudo  range,  and  the  GPS  carrier  phase  data 
were  generated  according  to  that  movement. 

In  the  simulation  the  GPS  receiver  starts  with  6  satellites.  At  time  t  =  100, 
the  receiver  looses  3  satellites,  and  it  gains  one  satellite  at  t  —  400.  We  want  to 
emphasize  that  for  instantaneous  stand  alone  positioning  GPS  requires  at  least  4 
satellites.  Figures  5. 7-5.9  show  the  actual  and  estimated  x,  y,  and  z  components  of 
the  position  of  the  platform  in  ECEF  rectangular  system  of  coordinate.  The  esti¬ 
mates  are  given  by  three  methods,  EKF,  particle  filtering,  and  projection  particle 
filtering.  The  error  of  these  three  methods  are  plotted  in  Figure  5.10.  From  this 
figure,  it  can  easily  be  seen  that  EKF  fails  to  give  an  acceptable  estimate  of  the 
position  when  the  number  of  satellites  in  view  is  below  four.  Unlike  EKF,  particle 
filtering  and  projection  particle  filtering  are  successful  in  providing  a  reasonable 
estimate  of  position.  Figure  5.11  is  the  repeated  version  of  Figure  5.10  with  an 
emphasis  on  the  comparison  of  the  errors  between  particle  filtering  and  projection 
particle  filtering.  For  the  same  number  of  particles,  here  500,  the  error  of  the  esti¬ 
mate  given  by  the  projection  particle  filtering  is  smaller  than  the  error  for  particle 
filtering.  Finally  we  should  mention  that  whenever  the  number  of  visible  satellites 
is  more  than  four,  EKF  can  provide  a  very  good  estimate  of  the  position. 
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Chapter  6 


Particle  Filtering  for  a  Family  of 
Mixture  Densities 


In  Chapter  4  we  introduced  a  new  projection  particle  filtering  method  for  an  ex¬ 
ponential  family  of  densities.  We  proved  that  if  a  family  of  densities  exists  that 
is  close  to  the  true  conditional  density,  then  the  error  of  the  estimate  given  by 
projection  particle  filtering  can  be  bounded  and  this  bound  depends  on  the  choice 
of  the  specific  exponential  family.  Finding  such  a  family  is  not  an  easy  task.  This 
fact  was  a  motivation  for  us  to  study  particle  filters  for  a  family  of  mixture  densi¬ 
ties.  Here  we  assume  that  the  true  conditional  density  is  approximated  by  a  linear 
combination  of  a  finite  number  of  density  functions.  We  can  extend  the  result  of 
this  chapter  to  the  approximation  of  the  true  conditional  density  by  a  set  of  basis 
functions.  Using  this  assumption,  we  replace  the  empirical  distribution  in  [40]  with 
an  estimate  that  lies  in  the  family.  In  Theorem  6.1.8  we  show  results  similar  to 
the  ones  presented  in  Chapter  4. 
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6.1  Projection  Particle  Filtering  for  a  Family  of 
Mixture  Densities 

We  start  this  section  with  the  definition  of  a  family  of  mixture  densities. 

Definition  6.1.1  Let  {ci,  •  •  • ,  cp}  be  a  set  of  densities  defined  on  lZn ,  mid  6  G  1ZP. 
Then 

Si  =  {p{-,0)  =  0Tc(-),  s.t.^Oi  =  1,0.;  >  0,i  =  1,  •  •  -,p} 

i=  1 

is  called  a  family  of  mixture  densities,  where  6  =  (6fi,  •  •  • ,  9P)T  and  c  =  (ci,  •  •  • ,  cp)T . 

For  family  Si,  function  u(-),  and  random  vector  x  distributed  according  to 
v 

#jCj(x )  we  have 

1=1 

P 

Eu(x)  =  9iEiu(x ), 

i= 1 

where  Et(-)  is  the  expectation  with  respect  to  the  density  q.  In  particular  if 
u(-)  =  c(-),  we  have 

Ec(x)  =  [56, 

where  [3  is  a  p  x  p  matrix,  and  its  ij  element,  / 3^-  =  /  Cj(x)cj(x)rfx.  Here,  we 
assume  that  exists.  Therefore,  [5  is  positive  definite. 

If  x1,  •  •  • ,  xN ,  are  i.i.d.  random  vectors  distributed  according  to  p(-,  9),  then  we 
define  the  estimate  of  9,  9,  as  follows: 

1  N 

e=p 

JV  i=  1 

We  know  that 

9  =  (3~lEc{x), 

therefore,  9  is  an  unbiased  estimate  of  9.  From  the  strong  law  of  large  numbers  it 
is  clear  that  9  — >  9  as  N  — >  oo  with  probability  one. 
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We  have 


0-6  =  P  1  ^c(x)  -  , 

therefore,  the  error  estimate  for  the  parameter  6  can  be  bounded  as  follows: 

1  1  N  i  i  N 

— B||Bc(x)  -  -  £  c(x’)||  <  S||9  -  #||  <  — S||Sc(x)  -  -  £c(x')||, 

^max  i= i  ^min  ^  ^=1 

where  Amax  and  Amm  are  the  maximum  and  minimum  eigenvalues  of  the  matrix  /?, 
respectively.  It  is  very  reasonable  to  assume  that  the  variance  of  c(x)  under  p(-,  6) 
is  hnite,  then  3A  >  0  s.t.  £'||c(x)  —  A  c(x*)||  < 

If  6\  and  are  two  different  parameters  satisfying  the  condition  in  Definition 
6.1.1,  we  have 


Eq1  c(x)  -  Edo_  c(x) 


therefore, 


/  (0fc(x))  c (x)dx  -  /  (02  c(x))  c(x)dx 

m  -  o2)  , 


Amin 1 1 01  -  02  ||  <  ||£'01c(x)  -  i?02c(x)||  <  Amax||0i  -  02||,  (6.1) 

where  Equ(?l)  =  f  u(x)0Tc(x)dx.  With  this  introductory  explanation,  we  are  ready 
to  introduce  particle  hltering  for  a  family  of  mixture  densities. 

Algorithm  6.1.2  Particle  Filtering  for  a  Family  of  Mixture  Densities. 

•  Step  1  .  Initialization 

o  Sample  xj,  •  •  • ,  Xq",  N  i.i.d.  random  vectors  with  the  density,  po(x). 


•  Step  2  .  Diffusion 
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❖  Find  x^+1,---,  x^+1  from  the  given  x*,  x^7  using  the  dynamic 

rule: 


d~x.t  =  f 't(x.t)dt  +  Gt(xt)dwt,  ir  <  t  <  (i  +  l)r 

•  Step  3  .  Projection 

1  N 
1=1 

where  Q  is  to  make  sure  that  0  satisfies  the  conditions  in  Definition  6.1.1,  in 
particular  this  function  can  be  chosen  as  follows: 

WFT  *f  ll«+ll^0  (62) 

0  otherwise 

where  ||  •  ||i  is  the  regular  norm  one,  and  (-)+  =  max(-,0). 

•  Step  4  ■  Use  Bayes  ’  Rule 

2  _r(f  %+D-c(x)frn+i(x 

n+1  v  j$(n+  D-c(x)*n+1(x) 

•  57ep  5  .  Resample 

o  Sample  x*+1,  •  •  • ,  x^_,  according  to  p(x,  #n+i). 

•  Step  6  .  n  <—  n  +  1;  go  to  Step  (2). 

In  the  rest  of  this  section  we  will  prove  that  under  certain  conditions,  stated 
later,  the  error  of  the  estimate  associated  to  the  conditional  density  given  by 
Algorithm  6.1.2  can  be  bounded.  In  this  chapter  we  use  the  same  notion  for  the 
probability  spaces  that  where  used  in  Chapter  4. 

To  show  our  main  results  in  this  chapter  we  need  the  following  assumptions. 
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A  6.1.3  For  the  density  in  (3.6)  there  exists  a  family  of  densities  Si  such  that 
Vt  G  [0,T],  \/u  G  J-kn  where  ||^||i  =  1  and  e  >  0  such  that 


E\\EPt(u(x))  -  Eet(u(x))\\  <  e  . 

A  6.1.4  For  6*n-  in  (A6.1.3)  and  \&n(x),  3\&*(x)  =  aTc(x)  and  ||a||i 
that 


W9  where  ||#||i  =  1  and  Vtt(-)  G  EkK,  3e  >  0  such  that 
,  Eg^Jx)u(x.)  Eg'&*(x.)u(x)  .. 


E  ||- 


EgA/Jx) 


Ee^nix) 


<  e. 


Vtt(-)  G  Ek K,  3e  >  0  such  that 

Eg *  \&*(x)u(x)  Ep  _  ^n(x)u(x) 


E  ||- 


Eg*  \I/*(x)  En  T„(x) 


<  e. 


1  n\ 


Fact  6.1.5  V#i,#2  where  ||#i||i  =  1  and  || 6>2 1| i  =  1  we  have 
Ee  i^n(x)c(x)  F,2^;(x)c(x) 

~e^T) - ^*;(X)  - M ll"1"'211 

for  some  M  >  0. 

Proof:  Let  matrix  Aa  and  vector  ba  be  such  that 

Eg^>*n (x)c(x)  =  /  oTc(x)dTc(x)c(x)dx  =  Aa9 ,  and  similarly  Eg^* 

Then, 


E8l^n(x)c(x)  E02^*(x)c(x) 

AA 

Aae2 

Mi 

M2 

< 

Aa9 1 

Aa9  2 

+ 

A  02  A  02 

Mi 

Ml 

Ml  bae  2 

-  Mi 


<  M\\91-92\\, 
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(6.3) 
=  1  such 


:)  =  b  9. 


where  M  =  max  is  a  finite  constant. 

a, 6 1,02  Ml  ba6iba62 

O 

Assume  that  x* ,  •  •  • ,  in  Step  2  of  Algorithm  6.1.2  are  distributed  according 
to  p(-,  9n).  We  also  assume  that 

E\\Eznc(x.)  -  Ee*c(x)\\  <6  .  (6.4) 

At  the  end  of  the  time  interval  [nr,  (n  +  l)r]  we  have 

E 1 1 EE^ c (xnTS ( (n  +  l)r))  -  EE0*c(xnTjS((n  +  l)r))|| 

=  ^11  fc(x)p(x,  (n  +  l)r,  s,  nr)(p(s,  9n)  -  p(s,  9*))dxds\\ 

=  E\\  f  c(x)p(x,  (n  +  1  )r,  s,  nr)cT(s)(0„  -  0;)dxds|| 

<  A||  0re  —  0*||  /  ||c(x)p(x,  (n  +  l)r,  s,  nr)cT(s)||dxds 
<^11^ -0*||  , 

where  p(x,  (n+  l)r,  s,  nr)  is  the  transition  probability  from  state  s  to  state  x  in  the 
time  interval  [nr,  (n  +  l)r],  and  Li  —  f  f  ||c(x)p(x,  (n  +  l)r,  s,  nr)cT(s)||dxds  >  0 
is  a  constant,  possibly  depending  on  n.  Therefore,  using  (6.1)  3/1 1  >  0  s.t. 

E 1 1 EE^ c (xnT;S ( (n  +  l)r))  -  AAe.c(x„TjS((n  +  l)r))||  <  R\S.  (6.5) 

On  the  other  hand,  from  (4.4)  we  have 

1  N  k' 

E\\EE$nc(xnT:S((n  +  l)r))  -  —  ^ c(xnTjSl((n  +  l)r))||  <  Kh 2  +  — ^  ,  (6.6) 

i=  1 

where  s*  =  x^. 

From  Assumption  (A6.1.3)  we  have 

E 1 1  EEq*  c(xnTjS  ( (n  +  l)r ))  -  EPnc(xnTtS((n  +  l)r))|| 

=  A||/q(s)(p(s,0*)  -pn(s))ds||  (6-7) 

<  e  , 
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where  q(s)  =  /  c(x)p(x,  (n  +  1  )r,  s,  nr)dx  is  assumed  to  be  in  1  . 

We  have 

EEPnc(xnTS((n  +  l)r))  =  Ep  c(x). 

(n+1) 

Also  from  Assumption  (A6.1.3)  we  know  that  39*n+1y  s.t 

E\\Ee*  c(x)  -  Ap  c(x)||  <  e-  (6-8) 

(n+1)  (n+1) 

Therefore,  from  (6.5),  (6.6),  (6.7),  and  (6.8)  we  get 

]  N  v 

^H^(W)-C(X)  ~~  N  5Ic(X^si((n  +  1)T))H  -  KlS  +  2e  +  Kh 2  +  ^177-  (6-9) 

Z=1 

This  implies  that 

BIW  -  /3_1(sf  E  c(S„r,,.((»  +  l)r)))|| 

i-i  (6.10) 

<  1/ Amin(7ii<5  +  2e  +  A  h2  +  ^72  )• 

The  following  fact  is  needed  for  proof  of  the  theorem  that  will  be  presented 
later. 

Fact  6.1.6  For  positive  random  vector  a  G  1ZP ,  assume  ||a||i  =  1.  Also  assume 
that  (3  G  1ZP  is  a  random  vector  such  that  (/ 3)+  7^  0.  Then,  if 

E\\ot  —  (3\\  <  e, 

then 

E\\a  —  G{(3)  ||  <  ke, 
where  k  >  0  possibly  depending  on  p. 

Proof:  a  is  a  positive  vector,  therefore,  E\\a  —  (3\\  <  e  implies  A||a  —  (/3)  +  ||  <  e. 
Since  all  norms  are  equivalent  in  finite  spaces,  3L  >  0  such  that 

_ E\\a  -  (/?)+||i  <  Le. 

lrThis  is  a  very  mild  condition. 
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Therefore, 


E |  IKOTr-ll  <Le. 


On  the  other  hand,  we  have 


a  — 


(Q+ 

IKO+II 


ill  <  ll«-(/5)+||  +  l|(/3)+- 


IKO+lli 


<  ll«-(/5)+ll  +  S17  I  ll(/5)+lli-l 

{PY 


Therefore, 


where  k  —  L  +  1. 


£||o- 


<  fee, 


o 


In  Algorithm  6.1.2  we  know  0(n+1)-  =  G(f3  1(jj  J2  c(xr)rsi((n  +  l)r)))),  there 


i=  1 


fore,  by  using  Fact  6.1.6  and  (6.10),  we  can  conclude  that  3L2  >  0  such  that 


E||#(„+1)-  -  «(»+!)-  II  <  L2(KiS  +  2(  +  Kh 2  +  Ah). 


(6.11) 


We  assume  that  c(-)  G  A-fcre,  therefore,  from  Assumption  (A6.1.4),  Fact  6.1.5,  and 
(6.11)  we  have 


E 


0+1) 


'Pn+l(x)c(x) 


/•;- 

e 


0+1) 


'E'n+i(x)c(x) 


0+1) 


^n+l(x) 


e 


0+1) 


^n+l(x) 


(6.12) 


<Ci^  +  C2e  +  C3^  +  C4iV-1/2 

for  some  Ci,  C2,  C3,  C4  >  0. 

On  the  other  hand,  from  Assumption  (A6.1.4)  we  have 


E 


Eg *  '£„+l(x)c(x) 

Q  +  l)~ 


Eg*  VlW 

(n+l) 


Q+l)~ 


Ep  4'„  +  i(x) 

(n+l) 


=  E 


Eg*  4>„+l(x)u(x) 

p+l)~ _ 

Eg*  ®n+l(x) 

(n+l) 


-  E 


P(n+1) 


<  2e. 


(6.13) 
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In  Step  4  of  Algorithm  6.1.2  we  have 


'  E- 


0n+i  =  Q 


(™+i)- 


c(x)'hn+i(x) 


Ez 


V+i)- 


^n+l(x) 


therefore,  using  (6.12),  (6.13),  and  Fact  6.1.6  we  can  conclude  that 


EWEe(n+1)cW  -  ^(„+1)c(x)ll  <  M  +  P*e  +  M2  +  PaN  1/2 , 

for  some  positive  pi,  p2,  Ps,  Pa- 

We  summarize  the  results  of  this  section  in  the  following  theorem. 


Lemma  6.1.7  For  System  (3.3)  assume  (A3. 1.1),  (A3. 1.2),  (A6.1.3),  and 
(A6.1.4).  We  also  assume  c(-)  e  and  the  conditions  in  Theorem  4.1.4  with 
c(x)  replacing  u(x) .  Then  in  Algorithm  6.1.2  with  approximation  (4-3),  if 


EWEf^x)  -  E^c(x)\\  <  S 
then  3g7(,  g7),  o’),  gf  positive  such  that 

B||%(n+i)c(x)  -  %+„c(x)||  <  JIS  +  +  £h2  +  gJJV1'2. 

Lemma  6.1.7  is  the  building  block  for  our  main  result  in  the  next  theorem. 


Theorem  6.1.8  For  System  (3.3)  assume  (A3. 1.1),  (A3. 1.2),  (A6.1.3),  and 
(A6.1.4).  We  also  assume  c(-)  e  3FkK  and  the  conditions  in  Theorem  4.1.4  with 
c(x)  replacing  u(x) .  Then  in  Algorithm  6.1.2  with  approximation  (4-3),  if 

S||S&c(x)-£„;c(x)||<«S 

then  for  all  t  G  [0,  T],  g2,  g 3,  £4  positive  such  that 

E\\Efic(x)  -  Eg* c(x)||  <  QiS  +  g2e  +  Qzh2  +  g4N~1/2. 


79 


Here  we  would  like  to  make  some  remarks. 


•  In  Theorem  6.1.8  four  different  factors  can  increase  the  accuracy  of  the  esti¬ 
mation  method: 

—  N:  the  number  of  particles.  When  N  — >  oo  the  error  due  to  the 
limited  number  of  particles  disappears. 

—  h:  the  step  size  in  the  solution  of  the  stochastic  differential  equation. 
If  instead  of  a  differential  equation  the  system  dynamics  is  given  by  a 
difference  equation  this  error  disappears.  Also,  when  h  — >  0  the  error 
due  to  the  approximate  solution  for  the  stochastic  differential  equation 
goes  to  zero. 

—  e:  the  closeness  of  the  true  conditional  density  to  the  family.  A  smaller 
e  means  a  more  accurate  family  of  densities. 

—  5:  the  initial  estimate.  It  is  clear  that  a  better  initial  estimate  of  the 
density  enhances  the  estimate  of  the  density  for  the  time  t  E  [0,  T] 

•  An  immediate  result  of  Theorem  6.1.8  can  be  summarized  as  follows: 

Corollary  6.1.9  For  System  (3.3)  assume  (A3. 1.1),  (A3. 1.2),  (A6.1.3), 
and  (A6.1.4).  We  also  assume  the  conditions  in  Theorem  4.1.4  with  c(x) 
replacing  w(x) .  Then  in  Algorithm  6.1.2  with  approximation  (4-3),  if 

E\\e0  -  e*0\\  <  5 

then  zki,  t2,  £3,  £4  positive  such  that 

—  0)  ||  <  +  L2e  +  £3 hf  +  L4N  1//2, 

for  all  t  E  [0,  T] . 
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6.2  Discussion 


In  Chapter  4  we  used  an  exponential  family  of  densities  for  approximating  the 
conditional  density.  We  have  used  this  family  (in  the  context  of  projection  particle 
filtering)  for  position  estimation  in  an  integrated  INS/GPS  (Chapter  5),  and  also 
for  integer  ambiguity  resolution  for  a  carrier  phase  differential  GPS  (Chapter  7). 
Although  in  both  cases  we  have  been  able  to  achieve  very  good  results,  applying 
projection  particle  filtering  for  an  exponential  family  of  densities  does  not  seem  to 
be  a  trivial  task  for  general  cases,  where  we  don’t  have  any  idea  about  suitable 
exponential  family.  In  fact,  finding  the  proper  exponential  family  for  a  specific 
problem  is  quite  challenging  [11], 

In  this  Chapter,  we  have  chosen  a  mixture  of  densities  to  approximate  the 
conditional  density.  The  components  of  this  mixture  may  be  viewed  as  a  type  of 
basis  functions.  In  [15]  an  approach  different  but  close  to  our  approach  was  used 
for  a  tracking  problem.  In  that  approach,  the  components  of  the  family  of  mixture 
densities  are  allowed  to  change.  The  new  components  are  calculated  according  to 
the  discrete  time  dynamics.  Using  the  same  method  for  nonlinear  continuous  time 
dynamics  is  not  efficient,  because  the  conditional  density  of  the  state  given  the 
initial  condition  should  be  calculated  in  order  to  find  the  new  components  of  the 
family  of  mixture  densities.  This  is  equivalent  to  solving  the  forward  Kolmogorov 
equation. 

In  our  future  work  we  intend  to  use  the  method  introduced  in  this  chapter 
for  position  estimation  in  an  integrated  INS/GPS.  We  expect  that  a  performance 
similar  to  the  one  in  Chapter  5  can  be  achieved,  with  lower  computational  burden. 
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Chapter  7 


Integer  Ambiguity  Resolution 
Using  Particle  Filtering 


Wherever  possible,  using  differential  GPS  allows  users  to  have  a  more  accurate 
measurement.  In  fact,  a  good  portion  of  the  positioning  error  can  be  removed 
from  the  estimation  using  this  method  [53].  This  is  due  to  the  fact  that  the 
error  in  GPS  navigation  data  has  a  strong  spatial  correlation,  and  this  error  can 
be  removed  by  comparison  of  measurements  from  two  receivers  that  are  relatively 
close  to  each  other.  A  significant  improvement  in  positioning  accuracy  is  possible  if 
one  can  measure  the  carrier  phase  of  the  GPS  signal.  With  today’s  technology  it  is 
possible  to  measure  the  phase  of  the  carrier  within  10-3  modulo  an  integer  number 
of  full  cycles  [3].  Unfortunately,  for  positioning  purposes  this  is  not  enough  and  one 
needs  the  exact  phase  difference  between  the  transmitted  and  received  signal  to 
estimate  the  position.  As  mentioned  in  previous  chapters,  the  difference  between 
the  measured  and  the  actual  phase,  an  unknown  integer  times  2ir,  is  called  integer 
ambiguity  [28].  Resolving  this  ambiguity  has  been  shown  to  be  quite  challenging. 

The  available  integer  ambiguity  resolution  methods  are  mostly  based  on  a  rough 
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estimation  of  the  integer  ambiguity  and  a  search  method  to  find  the  correct  integer 
value  [3].  In  the  LAMBDA  method  [52],  using  a  least  square  estimation  technique, 
first  a  float  solution  for  the  integer  ambiguity  is  found.  Then  through  a  search 
method  the  integer  vector  that  minimizes  the  variance  of  the  error  is  estimated. 
If  the  covariance  matrix  associated  to  the  integer  solution  is  diagonal,  the  integer 
vector  that  minimizes  the  variance  of  the  error  is  an  integer  vector  whose  elements 
are  closest  to  the  elements  of  the  float  solution  vector.  In  practical  problems  that  is 
not  the  case,  therefore,  a  search  for  the  integer  solution  is  unavoidable.  In  Figure 
7.1  a  two  dimensional  integer  least  square  problem  is  shown.  In  this  figure  the 
float  solution  is  shown  by  x  and  the  surfaces  with  the  same  error  are  shown  by 
solid  lines.  It  can  be  seen  that  the  nearest  integer  vector  is  not  the  integer  vector 
that  minimizes  the  error.  In  a  high  dimensional  problem  finding  the  solution  for 
the  integer  least  square  problem  is  quite  challenging  and  the  search  space  for  the 
solution  could  be  quite  large. 

The  idea  in  the  LAMBDA  method  is  to  find  a  transformation  that  maps  integer 
vectors  to  integer  vectors  and  at  the  same  time  maps  the  covariance  matrix  to  a 
matrix  that  is  diagonal  or  dominantly  diagonal.  Although  finding  this  transfor¬ 
mation  has  been  shown  to  be  NP  complete,  a  suboptimal  implementation  of  this 
method  is  proven  to  reduce  the  size  of  the  search  space  effectively  [52], 

In  our  method,  we  first  approximate  the  conditional  probability  density  of  the 
position  of  the  rover  (mobile  GPS  receiver)  given  the  double  difference  measure¬ 
ment  for  the  pseudo  range  observable.  This  density  is  used  for  the  initialization 
that  leads  to  the  conditional  pmf  of  the  integer  ambiguity  given  the  double  differ¬ 
ence  carrier  phase  measurements. 

In  the  initialization  part  we  use  particle  filtering  as  the  tool  for  the  approxima- 
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Figure  7.1:  Example  where  the  nearest  integer  vector  and  the  integer  vector  that 
minimizes  the  error  are  far  apart. 


tion  of  the  conditional  distribution.  To  find  the  approximate  conditional  pmf  of 
the  integer  ambiguities,  given  the  double  difference  carrier  phase  measurements, 
we  use  a  modified  version  of  particle  filtering.  Since  the  set  of  the  integer  ambigu¬ 
ities  is  a  discrete  set  and  the  size  of  this  set/space  is  large,  using  regular  particle 
filtering  might  not  give  the  proper  result.  To  overcome  this  problem,  instead  of 
the  empirical  pmf  for  the  particles,  we  approximate  the  conditional  pmf  with  an 
exponential  form,  in  particular  with  a  Gaussian  shape.  The  filtering  method  here 
is  very  similar  to  the  regular  particle  filtering  method  explained  above.  The  only 
difference  is  that  after  applying  Bayes’  Rule,  we  use  MLE  to  find  the  parameters 
of  the  exponential  probability  mass  function.  This  pmf  is  used  for  generating  new 
particles.  This  is  analogous  to  the  method  presented  in  Chapter  4. 

Since  the  noise  power  in  the  carrier  phase  measurement  is  very  small  compared 
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to  the  pseudo  range  measurements  [53],  and  for  practical  purposes  the  number  of 
particles  is  small  compared  to  the  size  of  the  integer  set,  it  is  very  likely  that  one 
integer  vector  attracts  all  the  particles  and  ends  up  with  probability  equal  to  one. 
To  avoid  this  problem  we  start  the  algorithm  assuming  high  power  noise  for  the 
carrier  phase  measurement,  and  as  time  increases  we  reduce  the  noise  power.  This 
technique  is  similar  to  simulated  annealing  [54]  which  is  widely  used  in  stochastic 
optimization  techniques.  Reducing  the  noise  power  is  like  the  cooling  process  in 
the  simulated  annealing  method. 

In  Chapter  5  we  studied  the  problem  of  position  estimation  in  the  presence  of  an 
integer  uncertainty  by  augmenting  the  state  to  include  the  unknown  integer  vector. 
We  used  the  following  system  of  equations  for  a  moving  object  with  nonlinear 
dynamics  and  observations  similar  to  carrier  phase  differential  GPS. 

x„+i  =  f„(xn)  +  Gn(xn)wn 

yn  =  hn(xn)  +  Jnz  +  v„, 

where  z,  the  integer  ambiguity,  is  a  random  integer  vector,  i.e.  z  G  Zm  and  Jn 
has  the  proper  dimension.  Vector  z  is  assumed  to  be  constant  in  time.  One  way 
of  treating  the  integer  ambiguity  is  augmenting  the  state  x  with  the  integer  z.  In 
this  case, we  have 


y n  hn(xn)  T  Jnzn  T  vn  . 


We  used  particle  filtering  to  estimate  the  integer  ambiguity  as  well  as  the  po¬ 
sition  of  a  moving  object  in  a  two  dimensional  space.  Although  the  results  there 
were  reasonably  good,  we  had  to  use  a  very  large  number  of  particles  for  better 
estimation  results.  In  this  Chapter  we  use  another  approach.  We  use  the  results 
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of  Theorems  4.1.6  and  4.2.7  and  we  apply  a  method  similar  to  Algorithm  4.1.1. 
In  this  approach,  first  we  estimate  the  integer  uncertainty  and  then  we  use  the 
estimated  integer  for  accurate  positioning.  The  details  of  this  approach  are  given 
in  the  following  sections. 


7.1  Rationale 


In  Algorithm  4.1.1  we  proposed  a  particle  filtering  method  for  exponential  families 
of  densities.  Here  we  show  that,  that  algorithm  is  applicable  to  integer  ambiguity 
estimation.  To  support  our  claim  we  will  give  some  simulation  results  with  GPS 
data  in  the  following  sections,  but  we  also  want  to  justify  why  our  claim  is  reason¬ 
able.  For  the  sake  of  argument,  we  assume  that  the  ambiguity  is  real,  i.e.  n  e  7 Zm, 
so  we  can  assign  a  probability  density  function  to  it.  In  the  rest  of  this  section 
we  go  through  an  approximate  calculation  of  the  probability  density  function  of 
the  real  valued  ambiguity  given  the  observation  and  we  show  that  the  Gaussian 
density  is  a  good  candidate  for  the  exponential  family  of  densities. 

Consider  the  measurement  model  in  (2.1)  and  (2.2).  The  observation  is  the 
result  of  a  double  differencing  from  a  possibly  moving  rover  and  a  static  base.  We 
assume  that  during  the  observation  no  cycle  slip  happens.  We  seek  to  estimate  the 
conditional  probability  density  of  the  real  valued  ambiguity  given  the  observations 
up  to  time  n  +  1 ,  i.e. 


p(n|<f>"+1,  P”+1) 


P{4)n+ 1?  Pn+ 1 

n,  P]n)p(n 

In  P(4>n+ 1 1  Pn+1  \ 

n,  <Fjb  Pf)p( n 

4>  jl  ,  Pf)dn 

(7.2) 


where  <f>"  =  {0i,  02,  •  •  • ,  <f>n}  and  P”  =  {pi,P2,  •  •  •  ,pn}  are  the  observation  sets  up 
to  and  including  time  n.  We  want  to  show  that  if  p(n|$”,P”)  is  Gaussian,  then 
p(n|<&i+1,  P"+1)  is  approximately  Gaussian.  To  show  this,  we  only  need  to  show 


that  p(0n+i,pn+i|n,  <f>",  P")  has  the  following  form: 

p(0n+i,Pn+i|n,  $1,  Pi)  =  aexp(-i(n  -  /3)Tr_1(n  -  (3)), 

for  some  a ,  /3,  and  T  that  do  not  depend  on  n. 

For  our  integer  ambiguity  resolution  method,  we  assume  that  no  information 
about  the  dynamics  of  the  receiver  is  available.  Therefore, 


P(0n+l,Pn+l|n,  $ i,Pi ) 


p(0n+l|pn+l,  n,  $7,  P?)p(pn+i\n,  Pf) 
p((f)n+l\Pn+l,  n,  $1,  iT)p(p„+l). 


For  p(<f)n+i\pn+i,  n,  <f>™,  P"),  we  have 

p(0„+i|pn+i,n,$7,P1n)  =  /p(0n+1|pn+i,n,$7,P1n,xn+i) 

p(xn+i|pn+1,  n,  <f>j\  P”)dxn+i  (7.3) 

=  /  p(0n+i  |n,  xn+i)p(xn+1  |pn+i)dxn+i . 

From  (2.2),  we  get 

P{<t>n+i |n,xn+i)  =  Ki  exp  (— |(0„,+i  —  p(xn+1)  —  n)T 

V  “  (7-4) 

£/(&*+ 1  -p(xn+i)  —  n)J  , 

where  E^  is  the  covariance  matrix  of  the  double  difference  carrier  phase  observation 
noise,  p(x„+i)  is  the  vector  of  double  difference  true  range,  and  is  a  normalizing 
factor.  Here  we  emphasize  that  the  norm  of  E^  is  very  small,  therefore,  in  (7.3) 
the  contribution  of  the  region  that  is  outside  a  small  neighborhood  of  x*,  the  point 
that  maximizes  the  argument  of  the  exponent  in  (7.4),  is  negligible.  This  justifies 
the  approximation  of  p(0n+i|n,  xn+i)  by  linearization,  i.e.  we  get 


p((j)n+ i|n,  xn+i)  «  Ki  exp  (-§(</>n+1  -  p(x* )  -  n  -  4Axn+i)] 


Ea  {4>n+i  -  p(x*)  -  n  -  HAx 


n+ 1 


(7.5) 


where  xn+i  =  x*  +  Axn+] ,  and  A  = 


dx. 
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On  the  other  hand,  from  (2.1),  we  have 


Pn+ 1  —  P(xr)+l)  +  Gi+1- 

Therefore,  after  linearizing  p(xn+i),  and  using  the  generalized  inverse  of  A,  we  get 

Axn+i  «  ( ATA)~1AT(pn+1  -p(x*))  -  ( ATA)~1ATen+1 . 

Here  we  assume  that  A  is  full  rank,  i.e.  a  sufficient  number  of  satellites  with 
acceptable  geometry  is  available.  Therefore, 

p(Axn+i|pn+i)  «  K2exp (-^(P>  -  Axn+i)TX _1(P>  -  Axn+i)),  (7.6) 

where  D  =  (AT  A)-1  AT  (pn+i  —  p(x*)),  T  =  (AT  A)^1AtT,pA(At  A)^1,  Ep  is  the 
covariance  matrix  of  the  double  difference  code  measurement  noise,  and  k 2  is  a 
normalizing  factor. 

From  (7. 3), (7.5)  and  (7.6),  we  get 

p((pn+i\pn+i,  n,  <f>",  P*)  =  fpi&n+i |n,  Axn+1)p(Axn+1|pn+1)dAxn+i 

~  aexp(— |(n  -  /5)TT_1(n  -  /?)), 

where  a,  /?,  and  T  only  depend  on  the  matrices  E^,  Ep,  and  A,  and  the  vectors 
pn+i,  p(x*),  and  0n+i.  Since  p(pn+i|n,  <f>”,  P")  =  p(pn+i)  does  not  depend  on  n,  we 
can  claim  that  the  conditional  density  p(n|<h”+1,  P"+1)  is  approximately  Gaussian. 

In  the  above,  we  did  not  use  the  fact  that  n  is  a  real  valued  vector.  Therefore, 
by  using  a  similar  argument,  we  can  claim  that,  if  for  an  integer  vector,  n,  the 
pmf,  P(n|<h,]l,  P"),  has  a  Gaussian  shape,  then  the  pmf,  P(n|$”+1,  P”+1),  also  has 
a  Gaussian  shape. 


7.2  Particle  Filtering  for  Gaussian  Shaped  Dis¬ 
tributions 

Using  the  justification  in  Section  7.1,  we  can  replace  the  empirical  distribution  of 
the  particle  filtering  by  an  exponential  family  with  a  Gaussian  shape.  Algorithm 
7.2.1  takes  this  modification  into  account. 

Algorithm  7.2.1  Particle  Filtering  for  a  Gaussian  Shaped  Distribution. 

•  Step  1  .  Initialization 

o  Sample  nj,  •  •  • ,  n^,  N  i.i.d.  random  variable  with  the  distribution, 
^o(n). 

•  Step  2  .  New  measurement  and  Bayes  ’  Ride 

n 

N  £  ^(n)  •  Wn+l.Pn+lk) 
PAr(n|<h”+1,P”+1)  =  — ^ - 

jf  E  Snj  (rin)  •  p(0n+l,Pn+l|nn) 

3= 1 

•  Step  3  .  Find  the  mean  and  the  covariance  estimates  for  PJV(n|<h"+1,  P”+1). 

N 

hn+1  =  ZPN«m+1,P?+1)< 

i=  1 
N 

s»„+,  =  EF»(nj,|*J+I,P1”+1)K-nn)K-nn)1' 

i=l 

•  S7ep  ^  .  Resample 

o  Sample  real  valued  n^+1,  •  •  • ,  h;^+ ,  according  to  p(n|<h"+1,  P”+1). 
where 

p(n|^+1,P"+1)  = 
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exp  (~y-(n  -  nw+1)r  £n*+1  (n  -  n„+1)) 
V/(2vr)mdet(Snn+1) 


•  Step  5  .  n^+1  =  g(n/+1),  for  i  =  1,  •  •  • ,  N.  g(-)  is  a  rounding  function. 


•  Step  6  .  n  <—  n  +  1;  go  to  Step  (2). 


where  <5v(w)  =  1  if  w  =  v,  and  0  otherwise. 

In  Algorithm  7.2.1,  if  the  number  of  particles  is  small,  a  bad  initialization 
causes  significant  estimation  error.  To  overcome  this  problem  one  can  increase  the 
number  of  particles  and/or  choose  the  initialization  carefully.  Increasing  the  num¬ 
ber  of  particles  increases  the  computational  cost  which  is  not  desirable.  Therefore, 
choosing  a  proper  initialization  is  of  great  importance. 

In  the  integer  ambiguity  resolution  problem,  we  first  initialize  the  conditional 
pmf  of  the  integer  ambiguity  using  the  pseudo  range  measurement.  Since  the  noise 
power  of  the  pseudo  range  measurement  is  significantly  larger  than  the  noise  power 
of  the  carrier  phase  measurement,  it  is  very  likely  that  one  of  the  integer  vectors, 
that  has  probability  greater  than  the  others,  ends  up  with  probability  one  and  the 
rest  of  integer  vectors  end  up  with  zero  probability.  To  avoid  this,  we  alter  the 
covariance  matrix  for  the  carrier  phase  measurement  and  the  covariance  matrix  for 
p(n|<h")as  follows: 


+  n  2 1 
Snn  =  (l  +  f)£„. 


(7.7) 


where  a  is  a  constant  coefficient.  The  idea  of  changing  the  covariance  matrices 
is  borrowed  from  the  simulated  annealing  technique.  Similar  to  the  temperature 
decrease  in  that  technique,  here  we  decrease  the  additional  uncertainty  added  to 
the  data  as  time  grows.  The  change  of  the  covariance  matrix  in  (7.7)  is  not  unique, 
but  the  general  form  of  (7.7)  should  be  kept. 

Once  we  have  the  conditional  pmf  of  the  integer  ambiguity  given  the  obser¬ 
vations,  the  estimate  for  integer  ambiguity  can  be  obtained  by  hireling  the  point 
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where  the  conditional  pmf  is  maximum.  This  is  equivalent  to  finding  the  MAP 
estimate  of  the  integer  ambiguity. 

7.3  Simulations  and  Results 

Using  the  data  described  in  Section  5.2.3,  we  generated  the  pseudo  range  and 
carrier  phase  data  for  one  static  and  one  moving  receiver  in  the  same  way  as  was 
performed  in  Section  5.2.3.  To  be  able  to  check  our  method  we  added  an  artificial 
integer  ambiguity  to  the  simulated  data.  We  chose  a  three  dimensional  random 
walk  dynamics  with  nonzero  mean  speed,  (x,  y,  z)^+1  =  (. x ,  y ,  z)^  +  (2, 1, 1)  +  2e„, 
where  en  is  a  three  dimensional  zero  mean  Gaussian  random  vector  with  unit  power. 
The  spatial  and  temporal  units  are  assumed  to  be  meter  and  second,  respectively. 

We  applied  the  method  discussed  in  Section  7.2,  for  different  numbers  of  mea¬ 
sured  epochs.  For  each  of  these  cases  we  simulated  1000  different  trials  and  we 
counted  the  number  of  times  that  the  algorithm  doesn’t  find  the  correct  integer 
vector.  We  call  these  incorrect  outcomes  error  of  the  estimate.  The  results  of  these 
experiments  are  summarized  in  Table  7.1.  For  the  case  where  20  epochs  are  used, 
the  error  is  equal  to  %2.9. 

Given  the  double  difference  pseudo  range  and  double  difference  carrier  phase 
measurements,  the  integer  ambiguity  estimate  given  by  Algorithm  7.2.1  is  a  random 
variable.  Therefore  one  can  run  the  algorithm  several  times  for  the  same  data 
to  confirm  the  estimate.  Using  this  idea,  in  a  separate  experiment  we  ran  the 
algorithm  for  each  set  of  observations  three  times.  If  at  least  two  out  of  three  of 
the  estimated  integer  ambiguities  were  the  same,  we  would  accept  the  repeated 
estimate  as  the  integer  ambiguity  estimate.  If  none  of  the  estimates  were  the  same 
we  would  reject  all  answers.  The  results  of  this  experiment  are  summarized  in 
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No.  of  Epochs 

No.  of  Particles 

No.  of  trials 

Error  % 

2 

5000 

1000 

%  40 

3 

5000 

1000 

%  28 

5 

5000 

1000 

%15.8 

10 

5000 

1000 

%5.8 

20 

5000 

1000 

%2.9 

Table  7.1:  The  percentage  of  error  for  integer  ambiguity  estimation 


No.  of  Epochs 

No.  of  Particles 

No.  of  trials 

Rejection  % 

Error  % 

5 

5000 

3  x  1000 

%2.1 

%13.7 

10 

5000 

3  x  1000 

%0.7 

%1.2 

20 

5000 

3  x  1000 

%0.6 

%0.0 

Table  7.2:  The  percentage  of  error  for  integer  ambiguity  estimation 

Table  7.2.  It  can  be  seen  that  when  a  sufficient  number  of  measured  epochs  is 
available,  the  error  percentage  can  be  reduced  significantly.  It  is  also  seen  that  for 
the  case  with  a  small  number  of  epochs,  the  repeated  trials  don’t  help.  This  is 
because  the  small  number  of  epochs  makes  the  algorithm  adapt  itself  to  the  data. 

There  are  a  few  points  about  our  method  that  we  emphasize  on: 

•  This  method  can  be  applied  to  kinematic  positioning  as  well  as  static  posi¬ 
tioning. 

•  In  this  method  we  do  not  linearize  the  observation  equations,  therefore  the 
correlation  between  the  noise  of  the  different  measurements  is  only  due  to 
the  double  differencing. 
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•  The  use  of  a  conditional  pmf  reduces  the  need  to  make  complicated  searches 
to  resolve  integer  ambiguities,  e.g.  integer  least  squares. 
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Chapter  8 


Detection  of  Abrupt  Changes  in  a 
Nonlinear  Stochastic  System 


In  many  practical  problems  arising  in  quality  control,  fault  detection,  and  integrity 
monitoring,  the  underlying  system  can  be  represented  by  a  parametric  model.  The 
parameters  of  such  models  usually  can  be  categorized  into  two  different  sets.  The 
first  set  contains  the  parameters  that  change  slowly  with  respect  to  time,  for  ex¬ 
ample  the  parameters  that  describe  the  conditional  density  of  position-velocity- 
orientation  in  a  navigation  system  are  of  this  type.  The  second  set  contains  the 
parameters  that  are  subject  to  sudden  changes.  These  sudden  changes  are  the 
results  of  a  failure  in  the  system  dynamic,  malfunctioning  of  measuring  instru¬ 
ments,  or  perhaps  the  result  of  a  change  in  the  state  of  the  system.  We  refer  to 
these  changes  as  sudden  or  abrupt  because  the  time  frame  in  which  these  changes 
happen  is  much  smaller  than  the  response  time  of  the  system  which  is  limited  by 
the  nominal  bandwidth  of  the  system. 

The  abrupt  changes  in  the  system  do  not  need  to  be  catastrophic.  In  fact, 
in  this  dissertation  we  are  interested  in  studying  the  changes  that  degrade  the 
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performance/accuracy /efficiency  of  the  system,  but  do  not  stop  the  system  from 
functioning.  A  monitoring  system  is  responsible  for  detecting  and  isolating  these 
changes. 

Online  detection  of  abrupt  changes  for  linear  dynamical  systems  have  been 
studied  extensively  (qv.  [8]  and  the  references  therein).  Unlike  the  linear  case, 
change  detection  for  nonlinear  dynamical  stochastic  systems  has  not  been  investi¬ 
gated  in  any  depth.  In  the  cases  where  a  nonlinear  system  experiences  a  sudden 
change,  linearization  and  change  detection  methods  for  linear  systems  are  the  main 
tools  for  solving  the  change  detection  problem  (see  [41]  for  example).  The  reason 
for  this  lack  of  interest  is  clear;  even  when  there  is  no  change,  the  estimation  of 
the  state  of  the  system  given  the  observations  results  in  an  infinite  dimensional 
nonlinear  filter;  the  change  in  the  system  can  only  make  the  estimation  problem 
harder. 

As  we  discussed  in  the  previous  chapters,  the  theoretical  results  regarding  the 
convergence  of  the  approximate  conditional  density  given  by  particle  filtering  to 
the  true  conditional  density,  suggests  that  this  method  is  a  useful  approximation 
to  exact  nonlinear  filtering.  We  believe  that  particle  filtering  and  its  modifications 
are  a  starting  point  to  study  change  detection  for  nonlinear  stochastic  systems. 
Here  we  use  the  results  in  Chapter  4  and  we  develop  a  new  change  detection 
method  for  nonlinear  stochastic  systems.  We  show  that  for  nonlinear  systems  the 
computational  complexity  of  the  CUSUM  algorithm  grows  with  respect  to  time, 
therefore,  it  is  inapplicable  in  many  practical  applications.  We  introduce  a  change 
detection  method  based  on  a  likelihood  ratio  test  and  a  new  statistic.  We  show  that 
this  statistic  can  be  calculated  recursively  with  constant  computational  complexity. 

In  Chapter  5  we  showed  that  when  the  number  of  satellites  is  below  a  critical 


95 


number,  linearization  methods  such  as  EKF  result  in  an  unacceptable  position 
error  for  an  integrated  INS/GPS.  We  also  showed  that  the  approximate  nonlinear 
filtering  methods,  projection  particle  filter  in  particular,  are  capable  of  providing 
an  acceptable  estimate  of  the  position  in  the  same  situation. 

In  an  integrated  INS/GPS,  if  the  carrier  phase  of  the  GPS  signal  is  used  for 
positioning,  sudden  changes  of  the  phase  measurement  clue  to  the  cycle  slip  should 
be  detected  to  be  able  to  keep  the  integrity  of  positioning  method  intact.  A  cycle 
slip  happens  when  the  phase  of  the  received  signal  estimated  by  the  phase  lock  loop 
in  the  receiver  has  a  sudden  jump.  If  the  cycle  slip  is  not  detected  and  repaired  the 
position  given  by  an  integrated  INS/GPS  with  a  carrier  phase  receiver  is  no  longer 
reliable.  Therefore,  one  important  aspect  of  an  integrated  INS/GPS  is  to  detect 
such  sudden  changes.  Since,  in  critical  conditions,  linearization  methods  are  not 
capable  of  providing  the  estimate  of  the  position,  in  the  same  setup,  corresponding 
change  detection  methods  are  not  useful  either.  We  used  an  integrated  INS/GPS 
under  critical  conditions  as  an  application  of  our  method.  Since  the  proposed 
change  detection  method  assumes  known  parameters  after  change,  this  application 
should  not  be  considered  a  cycle  slip  detection  method. 

In  this  chapter,  first  we  briefly  define  the  change  detection  problem  and  we 
review  the  CUSUM  algorithm  for  linear  systems  with  additive  changes.  Then  we 
present  a  new  change  detection  method  for  nonlinear  stochastic  systems.  Finally, 
we  present  some  simulation  results  and  we  summarize  the  results  and  future  work. 

8.1  Change  Detection:  Problem  Definition 

On-line  detection  of  a  change  can  be  formulated  as  follows  [8].  Let  y™  =  (yi,y2, 
•  •  • ,  yn}  be  a  sequence  of  observed  random  variables  with  conditional  density 


96 


Pe(yk\yk-1,  ■  ■  •  ,yi).  Before  the  unknown  change  time,  t0,  the  parameter  of  the 
conditional  density,  9,  is  constant  and  equal  to  9q.  After  the  change,  this  param¬ 
eter  is  equal  to  6\ .  In  online  change  detection,  one  is  interested  in  detecting  the 
occurrence  of  such  a  change.  The  exact  time  and  the  estimation  of  the  parameters 
before  and  after  the  change  is  not  required.  In  case  of  multiple  changes,  we  assume 
that  the  changes  are  detected  fast  enough  so  that  in  each  time  instance  only  one 
change  has  to  be  considered.  Online  change  detection  is  performed  by  a  stopping 
rule  [8] 

ta  =  inf{n  :  gn(y\l)  >  A} 

where  A  is  a  threshold,  (gn)n> i  is  a  family  of  functions,  and  ta  is  the  alarm  time, 
i.e.  the  time  when  change  is  detected. 

If  ta  <  to  then  a  false  alarm  has  occurred.  The  criteria  for  choosing  the  param¬ 
eter  A  and  the  family  of  functions  ( gn)n>i  is  to  minimize  the  detection  delay  for 
the  fixed  mean  time  between  false  alarms. 


8.2  Additive  Changes  in  Linear  Dynamical  Sys¬ 
tems 


Consider  the  following  system: 


(8.1) 


rfcrx(M0) 

EkTy(k,t0)  , 

proper  dimension,  and  T x(k,t0) 
and  Ty  (k,  t0 )  are  the  dynamic  profiles  of  the  assumed  changes,  of  dimension  n  <  n 
and  d  <  d,  respectively.  wk  and  vk  are  white  Gaussian  noise,  independent  of  the 


Xfc+l 

Fk^k 

+ 

Yk  = 

HkX-k 

+ 

Vfc 

Hk,  Bfc, 

and  Ek 

are 

matrices 
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initial  condition  x0.  It  is  assumed  thatTx(/c,  f0)  =  0  and  T y(k,to)  =  0  for  k  <  to, 
but  we  do  not  necessarily  have  the  exact  knowledge  of  the  dynamic  profile  and  the 
gain  matrices,  T k  and  Eq..  The  dynamic  profile  of  change  may  be  assumed  known 
or  unknown. 

For  the  case  of  known  parameters  before  and  after  change,  the  CUSUM  [8] 
algorithm  can  be  used,  and  it  is  well  known  that  the  change  detection  method  has 
the  following  form 

=  min {k  >  1  \gk>  A} 

=  max  S!-  (8.2) 

,  IlL,-PP«.j)(e0 
nUrtti)  7 

where  e*  is  the  innovation  process  calculated  using  Kalman  filtering,  assuming 
that  no  change  occurred,  and  p(i,j)  is  the  mean  of  the  innovation  process  at 
time  j  conditioned  on  the  change  occurred  at  time  i.  po  and  pp(.,.)  are  Gaussian 
densities  with  means  0,  and  />(•,•),  respectively.  The  covariance  matrix  for  these 
two  densities  is  the  same  and  is  calculated  using  Kalman  filtering. 

When  the  parameter  after  change  is  not  known,  the  algorithm  that  is  used  for 
the  change  detection  is  the  GLR  test  [55].  In  this  case,  gk  is  calculated  as  follows 

gk  =  max  sup  S (8.3) 

l<7‘<A:Yx,Ty  J 

The  solution  for  (8.3)  is  well  known  and  can  be  found  in  many  references  [8]. 

Similar  to  nonlinear  filtering,  change  detection  for  nonlinear  stochastic  sys¬ 
tems  results  in  an  algorithm  that  is  infinite  dimensional.  Linearization  techniques, 
whenever  applicable,  are  the  main  approximation  tool  for  studying  the  change  de¬ 
tection  problem  for  nonlinear  systems.  In  this  setup,  a  nonlinear  filtering  problem 
is  transformed  to  it  linearized  form  through  EKF  and  then  the  same  algorithms 
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that  are  used  for  the  linear  Gaussian  case  are  used  for  the  change  detection  prob¬ 
lem.  Although  linearization  techniques  are  computationally  efficient,  they  are  not 
always  applicable.  In  the  sections  to  come,  we  propose  a  new  method  based  on 
nonlinear  particle  filtering  that  can  be  used  for  change  detection  for  nonlinear 
stochastic  systems. 


8.3  Nonlinear  Change  Detection: 
Problem  Setup 


(8.4) 


Consider  the  following  nonlinear  system 

xfc+i  =  f£fc(xfe)  +  Glk(xk)wk 
yk  =  hj.fc(xfc)+vfc, 

where  xfc  G  7 Zn,  ynr  G  TZd,  wk  G  lZq  and  vk  G  1Zd  are  white  noise  processes  with 
known  statistics,  and  the  functions  f^(-)  and  h^(-)  and  the  matrix  Glk{-)  have 
the  proper  dimensions.  The  noise  processes  wk  ,  vk,  k  =  0,1,---,  and  the  initial 
condition  x0  are  assumed  independent.  We  assume  that 


^ k 


(8.5) 


0  k  <  to 

1 

i  k  >t0,  i  G  / 

where  /  is  a  countable  index  set.  The  index  0  is  used  for  the  nominal  system  and 
the  system  after  change  belongs  to  a  countable  set  of  systems.  Here,  we  assume 
that  the  set  /  has  only  one  member,  i.e.  we  assume  that  the  parameters  after  the 
change  are  known. 


In  this  setup  Sj  can  be  written  as  follows 


S7  =  In 


p(y?\y{  ,t0>k) 


(8.6) 
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p(xkiY0k;Vi) 


Figure  8.1:  Combination  of  nonlinear  filters  used  in  the  CUSUM  change  detection 
algorithm. 


Writing  (8.6)  in  a  recursive  form  we  get 

p(y}\y(~\t0  =  j)  =  fip(yi\yt\t0  =  j) ,  (8.7) 

i=j 

where  p(yi|(Vi_1,  to  =  j)  can  be  written  as  follows 

p(yi\ylr\t0  =  j)=  I  p(yi\xi)p(xi\y{-%,t0  =  (8-8) 

To  find  pipt^yy1 ,  t0  =  j )  in  (8.8),  one  needs  to  find  an  approximation  for  the 
corresponding  nonlinear  filter.  We  assume  that  this  approximation  is  done  using 
either  particle  filtering  or  projection  particle  filtering  (see  Chapters  3  and  4). 

To  calculate  the  likelihood  ratio  in  (8.6),  we  must  calculate  the  conditional 
densities  of  the  state  given  the  observation  for  two  hypothesis  (changed  occurred 
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at  j  and  change  occurred  after  k).  This  means  that  two  nonlinear  filters  should 
be  implemented  just  to  compare  these  two  hypothesis.  Therefore,  it  is  clear  that 
to  use  an  algorithm  similar  to  (8.2),  k  parallel  nonlinear  filters  should  be  imple¬ 
mented.  In  Figure  8.1,  we  see  that  the  computational  complexity  of  the  CUSUM 
algorithm  grows  linearly  with  respect  to  time.  In  most  applications  this  growth 
is  not  desirable.  One  possible  way  to  approximate  the  CUSUM  algorithm  is  to 
truncate  the  branches  that  are  forked  from  the  main  branch  in  Figure  8.1.  We 
will  explain  this  truncation  procedure  and  its  technical  difficulties  in  the  next  few 
lines. 

Recall  that  the  main  branch  (horizontal)  and  the  branches  forked  from  it  in 
Figure  8.1  represent  a  series  of  nonlinear  filters  with  specific  assumptions  on  the 
change  time.  The  dynamics  and  the  observation  equation  for  all  forked  branches 
are  the  same  and  the  only  difference  is  the  initial  density.  If  the  conditional  density 
of  the  state,  given  the  observation,  for  a  nonlinear  system  with  the  wrong  initial 
density  converges  (in  some  meaningful  way)  to  the  true  conditional  density  (ini¬ 
tialized  by  the  true  initial  density),  we  say  that  the  corresponding  nonlinear  filter 
is  asymptotically  stable  [14], 

For  asymptotically  stable  nonlinear  filters,  the  forked  branches  in  Figure  8.1 
converge  to  a  single  branch,  therefore  there  is  no  need  to  implement  several  parallel 
nonlinear  filters.  In  other  words,  after  each  branching  the  independent  nonlinear 
filter  is  used  for  a  period  of  time  and  then  this  branch  converges  to  the  branches 
that  have  forked  earlier,  i.e.  joins  them.  The  time  needed  for  the  branch  of 
the  independent  nonlinear  filter  to  join  the  other  forked  branches  depends  on  the 
convergence  rate  and  the  target  accuracy  of  the  approximation. 

Although  the  procedure  mentioned  above  can  be  used  for  asymptotically  stable 
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nonlinear  filters,  there  are  several  problems  associated  to  this  method.  The  known 
theoretical  results  for  identifying  asymptotically  stable  filters  is  limited  to  either 
requiring  ergodicity  and  the  compactness  of  the  state  space  [5,  6,  35]  or  very  special 
cases  of  the  observation  equation  [14].  The  rate  of  convergence  of  the  filters  in 
different  branches  is  another  potential  shortcoming  of  the  mentioned  procedure. 
If  the  convergence  rate  is  low  in  comparison  with  the  rate  of  parameter  change  in 
the  system,  then  the  algorithm  cannot  take  advantage  of  this  convergence. 


8.4  Nonlinear  Change  Detection:  Non  Growing 
Computational  Complexity 


In  this  section  we  introduce  a  new  statistic  to  overcome  the  problem  of  growing 
computational  complexity  for  the  change  detection  method.  We  show  that  this 
statistic  can  be  calculated  recursively. 

Consider  the  following  statistic 


Tk  ,  f(^iy',ioefe-,t}) 
‘  n  p(yl\yt\ta>k) 


(8.9) 


For  the  rest  of  this  chapter  we  assume  that,  conditioned  on  change,  the  change 
time,  to,  is  distributed  uniformly,  i.e. 

P(t0  =  i\t0e{j,---,k})  =  \  1=751  .  (8.10) 

0  otherwise 
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p(y},to  e  O', ■■■,k}\yi~1,t0  e  {j,  •••,&}) 
p(y},t0  =  j  I yi~1,t0e{j,-'‘ik})+ 
p(y},t0  =  j  +  i\yr\  t0  e  {j,--,k})+ 

p(y},t0  =  k  \yr\t0e{j,---,k}) 

=  J+ri(p(y}\y{-\to  =  j)+ 

p(y}\yi~\t0  =  j  +  i)+ 

■■■+p(y$\yr\to  =  k)), 

therefore, 

Tk  =  i  POj^f  L.'oe{;/.-.A-}) 

3  P(y*\y{-\t0>k) 

,  /  i  A  uyfiyr^owA 
11  \vfc-J+1  p(y;i^_1,t0>fc)  J  ' 

In  other  words,  Tjc  can  be  written  as  follows 

Ti  =  111  (fcZTTT  §  exp(5*))  •  (8'11) 

The  change  detection  algorithm  based  on  statistic  T*  can  be  presented  as  fol¬ 
lows 

ta  =  minj/c  >  j  |  T*  >  A  or  T*  <  —a},  (8-12) 

where  j  is  the  last  time  that  >  A  or  <  —a,  and  A  >  0  and  a  >  0  are 
chosen  such  that  the  detection  delay  is  minimum  for  a  fixed  mean  time  between 
two  false  alarms.  Using  (8.2)  and  (8.11),  we  try  to  find  a  relation  between  the 
detection  method  (8.12)  and  the  CUSUM  algorithm.  Assume  two  possible  extreme 
cases.  The  first  one  is  the  case  where  Sf  —  c,  Vi  e  {j,  ■  ■  ■ ,  k}.  In  this  case  it  is 


With  this  assumption  we  have 

p(yf\yt\t0e{j,---,k})  = 
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clear  that  Tj  =  Sf  Vi  G  {j,  ■  ■  ■ ,  k},  and  therefore,  the  performance  of  the  two 
methods  with  the  same  thresholds  is  the  same.  In  the  second  case  we  assume 
that  3 i,  l  G  {j,  ■  ■  ■ ,  k}  such  that  Sf  Sf ,  l  ^  i.  Therefore,  it  can  be  seen  that 
Tf  «  Si  —  In  (A;  —  j  +  1),  i.e.  Tj  is  degraded  by  —  In  (k  —  j  +  1).  With  this  simple 
analysis  we  can  conclude  that 

max  S?  -  In (k  -  j  +  1)  <  Tfe  <  max  S,k.  (8.13) 

Therefore,  with  the  same  thresholds  for  both  detection  methods,  (8.13)  can  be 
used  to  find  the  bounds  for  the  performance  of  the  detection  algorithm  in  (8.12) 
with  respect  to  the  CUSUM  algorithm.  We  want  to  emphasize  that  the  thresholds 
used  for  detection  method  (8.12)  need  not  be  the  same  as  the  thresholds  in  the 
CUSUM  algorithm,  in  fact,  they  should  be  optimum  according  to  the  criteria  for 
the  mean  detection  delay  for  the  detection  method  in  (8.12). 

The  main  advantage  of  using  the  statistic  Tj°  over  Sj  is  the  fact  that  T*  can 
be  calculated  recursively  without  growth  in  the  computational  complexity  of  the 
method  with  respect  to  time.  We  can  rewrite  p{y}-\y\^\  to  G  {j,  ■  ■  ■ ,  k})  as  follows 

k 

p(y^\yi~\t0  g  0, — ,  A;»  =  nKy^rVo  e  {j,  •••,&})• 

i=j 

Using  (8.10)  we  have 

p(y*|yr\  *0  e  {j,  ■  ■  ■ ,  k})  =  p(yi,  to  G  {j,  ■  ■  ■ ,  k}\y[-\ to  G  {j,  •  •  • ,  k}) 


=  p{yi,t0  G  {j,  •••,*}  1^1  x,  to  G  {j,  ■■■,k})  + 
p{ y», to  >  i\yr\t0  G  {j,  ■  ■  ■ ,  A;})  (8-14) 

=  |E^yp(y* l^r1,  to  G  {j,  — ,*})  + 
kzjyiP(yi\yly\t0  >  i)  . 
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From  (8.14)  it  is  clear  that  we  need  only  calculate  two  types  of  functions.  These 
two  functions  are  p(yi|(Vi_\  to  £  {j,  •••,*})  and  p(yi\y\~l,  to  >  i ).  To  calculate 
these  two  functions  we  can  use  the  following 

p(yj|^i_1^o  e  O',  —  ,  *}) 

=  Jp(yi\*i,  t0  e  {j,  ■  ■  • ,  i})p{*i\y[~l ,  t0  e  {j,  ■  ■  • ,  i})dxt 

=  iZ7yT/Pi(yi|x*)p(x,;|X_1,  to  =i)dxi  + 

jzjii I Pi(yi\^i)p(^i\ylrl,  t0  e  {j,---,i-i})dxi 
=  i=j+r/pi(y*|xiMxil^_1,  t0  >  i  i)dxi  + 

i=j+r/pi(yi|xi)p(xi|^-1,  *o  e  {j,  i})dx,: 

and 

P(yiW_1,<o  >  i)  =  /p(yi|xj,  t0  >  *)p(xj|^_1,  t0  >  ^)dxj 
=  /Po(y*|xi)p(xi|^"1,i0  >  i  -  1  )dxi  , 

where  po(y*lxi)  and  p\  (y.,|x,)  are  the  conditional  densities  of  the  observation  given 
the  state  of  the  system  before  and  after  the  change,  respectively.  To  calculate  these 
two  functions,  two  conditional  densities,  p(xj|^_1,t0  >  i  —  1)  and  p(xj|(VJ_1,  t0  € 
{j,  ■  ■  • ,  i  —  1})  should  be  found.  These  two  conditional  densities  can  be  calculated 
recursively  as  follows 

P(x»|iK_1,to  >  *  -  1) 

=  /p(xj|xj_i,  to  >  i  -  l)p(xi_i|yj-1,  to  >  i  -  l)dxj_i  (8.17) 

=  /p0(xi|xi_1)p(xi_i|3tj-1,  t0  >  i  -  l)rfxj_i  , 

where  po(xi|xj_i)  is  the  conditional  density  of  the  state  at  time  i  given  the  state 
at  time  %  —  1  assuming  that  no  change  has  happened  up  until  time  i  —  1.  The 


(8.15) 


(8.16) 
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recursion  is  complete  with 


p(xi~i\y[  \  t0>  i-1) 


p(xi-i  2,  to>i-l)p(yi-i|xi_i,  t0>i-l) 

J  p(x,:_i|yp2,  to>i-l)p(y.;-i|xi_i,  t0>i-l)a!x;_i 


(8.18) 


p(x,:~i|y^  2,  tp>i—2)p0  (yj_i  |xj_i) 
J'p(x,_i|yp2,  io>*— 2)p0 (y,_i |x,_i)dx,_i  ’ 


and  it  is  assumed  that  the  initial  density  of  the  state  is  known.  (8.17)  and  (8.18) 
are,  in  fact,  the  equations  for  the  nonlinear  filter  assuming  that  no  change  has 
happened.  For  the  other  conditional  density  we  have 


p(*i\y{  \  *o  e  O’,  ■■•,*-!}) 

=  /p(xi|xi_i,  t0  e  0,  to  e  0,  •••,*- l})dxi_! 

(8.19) 

=  -pi  /Pi(xi|xi_i)p(xi_i|^"1,  to  =  i  —  l)dxj_i  + 

/Pi(xi|xi_1)p(xj_1|^“1,  to  e  0,  •  •  •  ,i  -  2})dxi_1, 

where  pi(xj|x,j_i)  is  the  conditional  density  of  the  state  at  time  i  given  the  state 
at  time  i  —  1  assuming  that  a  change  has  occurred.  To  complete  the  recursion 
formula  we  have 


p(xj_i|^  1,  t0  =  i  —  1) 


pfxj-ipq  2,  t0=i-l)p(yi_i|xi_i,  t0=i-l) 

J  p(x,_i|yp“,  t 0 =i - 1  )p(y i - 1 1 Xj y|,  t0=t— l)dxj_:| 


(8.20) 


p(xi-i|y^  2,  tp>i—2)p1  (y,_i  |xj_i) 
J'p(x,_i|yp2,  to>i-2)p1(yi_i|xi_i)dxi_i  ’ 


and 


p(*i-i\y[  \  t0  g  o,  2}) 


p(xj_i|y,  2,  t0e{j,-,t-2})p(yi_i|xi_i,  to£{j,--,i-2}) 
f  p(xi_i|yj-2,  toe{i,--,i-2})p(yi_i|xi_i,  toe{j,-)*-2})<ixi_i 


(8.21) 


p(xj_i|y^  2,  toe{j,-,i-2})p1  (yi— i |x;_i) 

/  p(x,_i|yj-2,  t0e{j,  — ,i-2})p1(yi_i|xi_i)dxi_i  ■ 
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Figure  8.2:  Implementation  of  the  nonlinear  filters  used  in  the  change  detection 
algorithm  in  (8.12). 


Figure  8.2  shows  the  implementation  of  equations  (8.17)  through  (8.21);  it  can 
be  seen  that  the  complexity  of  the  implemented  nonlinear  filter  does  not  grow  with 
time.  Using  Figure  8.2  and  the  definition  of  Tj  we  have 

7?-§ta(rTM((*-‘)+f+(,-4))’  (8'22) 

where 

=  p{ y*|yr\  h  >  i ) 

=  p(yil^i_1)  t0  =  i ) 

=  p(yi\yi~\  t0e{j,--,i- 1)}- 
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8.5  Simulations  and  Results 


In  Chapter  5  we  showed  that  for  an  integrated  INS/GPS  when  the  number  of 
satellites  is  less  than  a  critical  number,  projection  particle  filtering  provides  a 
very  accurate  estimate  of  the  position  while  the  position  solution  given  by  EKF 
is  unacceptable.  In  this  section  we  use  the  same  example  to  apply  the  change 
detection  method  in  (8.12).  Similarly  to  Chapter  5,  for  a  critical  situation  (low 
number  of  observable  satellites)  the  linearization  methods  do  not  work,  particularly 
we  cannot  use  EKF.  On  the  other  hand,  the  CUSUM  algorithm  leads  to  a  growth  in 
computational  complexity  with  respect  to  time,  therefore,  at  this  point  a  natural 
selection  for  a  change  detection  algorithm  is  the  method  in  (8.12).  We  wish  to 
emphasize  that  in  the  example  given  in  this  section  we  assume  that  the  parameter 
of  change,  before  and  after  change,  is  known  and  the  only  unknown  parameter  is 
the  change  time.  In  future,  we  will  address  the  more  general  problem  of  unknown 
change  parameters. 

The  dynamics  of  an  integrated  INS/GPS  and  the  observation  equation  for 
differential  GPS  are  given  in  Chapter  5.  The  only  difference  is  that  we  assume 
that  the  signal  associated  to  one  of  the  satellites  experiences  an  abrupt  change,  i.e. 
we  assume  a  known  cycle  slip  in  one  of  the  channels. 

For  this  simulation  we  simply  chose  an  11  dimensional  Gaussian  density  for 
the  projection  particle  filtering.  This  choice  of  density  makes  the  random  vector 
generation  easy  and  computationally  affordable.  To  be  able  to  use  the  projection 
particle  filtering,  we  used  maximum  likelihood  to  estimate  the  parameters  of  the 
Guassian  density  before  and  after  Bayes’  correction. 

Using  the  data  described  in  Section  5.2.3,  we  generated  the  pseudo  range  and 
carrier  phase  data  for  one  static  and  one  moving  receiver  in  the  same  way  as  was 
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Tk  change  detection  statistic 


Figure  8.3:  This  figure  shows  the  plot  of  Tj  with  respect  to  time.  At  time  t  =  15, 
the  receiver  loses  3  satellites.  We  assume  that  the  cycle  slip  in  channel  one  occurred 
at  time  t  —  20. 


performed  in  Section  5.2.3.  Here  we  assume  that  for  the  carrier  phase  measurement 
the  integer  ambiguity  problem  is  already  solved.  We  also  assumed  that  the  phase 
lock  loop  associated  to  satellite  1  experiences  a  cycle  slip  and  the  phase  suddenly 
changes.  The  size  of  the  change  is  assumed  to  be  one  cycle.  The  movement  of  the 
INS/GPS  platform  was  simulation  based  and  the  measurement  data  measured  by 
the  accelerometers,  the  gyros,  the  GPS  pseudo  range,  and  the  GPS  carrier  phase 
data  were  generated  according  to  that  movement. 

In  the  simulation  we  assumed  that  the  GPS  receiver  starts  with  6  satellites. 
At  time  t  =  15,  the  receiver  loses  3  satellites.  We  assume  that  the  cycle  slip  in 
channel  one  occurred  at  time  t  =  20.  In  Figure  8.3  we  have  plotted  Tj  with  respect 
to  time.  In  the  figure  this  sudden  change  is  indicated  by  a  sudden  change  in  the 
value  of  . 
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Chapter  9 


Conclusions  and  Future  Work 


In  this  thesis  we  studied  filtering,  estimation,  and  detection  for  stochastic  systems 
with  nonlinear  dynamics  and  nonlinear  observations. 

We  presented  a  new  approximate  nonlinear  filtering  method  for  a  class  of  sys¬ 
tems  whose  conditional  density  lies  in  a  certain  family  of  exponential  densities.  We 
showed  that  under  the  conditions  stated  in  Chapter  4  the  approximate  conditional 
density  can  be  made  arbitrarily  close  to  the  true  conditional  density  in  the  sense 
described  in  the  same  Chapter.  We  also  proved  that  when  the  true  conditional 
density  does  not  lie  in  an  exponential  family  but  it  is  close  to  it,  the  error  of  the 
estimate  given  by  projection  particle  filtering  is  bounded.  Using  the  results  in 
Chapter  4,  we  presented  a  similar  method  for  a  family  of  mixture  densities. 

We  showed  that  for  an  integrated  INS/GPS  when  the  number  of  visible  satel¬ 
lites  is  below  a  critical  number  the  extended  Kalman  filter  fails  to  provide  an 
acceptable  estimate  of  the  position.  We  showed  that  under  the  same  conditions 
nonlinear  filtering  methods  are  capable  of  providing  an  accurate  estimate  of  the 
position.  Via  numerical  results,  we  also  showed  that  the  performance  of  the  pro¬ 
jection  particle  filter  exceeds  the  regular  particle  filter. 
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We  applied  a  method  similar  but  different  from  projection  particle  filtering  to 
the  integer  ambiguity  resolution  for  carrier  phase  differential  GPS.  In  this  method, 
we  assumed  that  the  integer  uncertainty  in  the  carrier  phase  measurement  is  an  in¬ 
teger  random  vector.  We  presented  an  algorithm  that  approximates  the  conditional 
density  of  the  integer  uncertainty  given  the  observation.  The  numerical  results  re¬ 
ported  in  Chapter  7  indicate  a  very  low  percentage  of  error  for  this  method. 

Another  problem  that  we  addressed  in  this  dissertation  is  the  problem  of  detec¬ 
tion  of  abrupt  changes  with  known  parameters  after  change  for  nonlinear  stochastic 
systems.  We  showed  that  applying  the  CUSUM  algorithm  for  such  systems  results 
in  a  growth  in  computational  complexity  with  respect  to  time.  To  avoid  this  prob¬ 
lem,  we  introduced  a  new  statistic  that  can  be  used  for  change  detection  methods 
based  on  a  likelihood  ratio  test.  We  showed  that  the  calculation  of  this  statistic 
can  be  done  recursively  with  fixed  computational  complexity  with  respect  to  time. 

The  work  we  presented  in  this  thesis  may  be  extended  towards  several  direc¬ 
tions.  In  the  following,  we  try  to  give  a  brief  description  of  these  directions. 

Numerical  results  in  Chapter  7  suggest  that  Algorithm  7.2.1  is  a  good  candidate 
for  an  integer  ambiguity  resolution  method.  Although  in  that  chapter  we  presented 
an  argument  to  justify  this  fact,  we  were  not  able  to  prove  that  Algorithm  7.2.1 
indeed  provides  an  approximate  solution  for  the  conditional  density  of  the  integer 
uncertainty  given  the  carrier  phase  observations.  One  direction  for  the  extension 
of  results  of  this  dissertation  would  be  the  investigation  of  this  matter. 

From  a  practical  point  of  view,  we  think  that  a  very  natural  extension  of  this 
research  would  be  to  implement  the  projection  particle  filter  for  an  integrated 
INS/GPS  in  real  time.  For  this  purpose  one  can  use  an  affordable  inertial  naviga¬ 
tion  system  and  a  hand  held  GPS  receiver  that  can  be  connected  to  a  computer. 
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We  believe  the  next  generations  of  cellular  phones  will  have  an  inexpensive  built- 
in  integrated  INS/GPS.  One  of  the  challenges  in  the  next  few  years  would  be  to 
design  such  systems  with  acceptable  reliability  and  accuracy. 

The  abrupt  change  detection  method  that  we  studied  in  Chapter  8  is  limited 
to  change  detection  where  the  parameters  after  change  are  assumed  to  be  known. 
In  future,  we  intend  to  extend  our  results  to  the  case  where  the  parameters  after 
change  are  unknown.  The  major  obstacle  in  this  extension  is  the  complexity  of  the 
change  detection  method.  Another  subject  that  requires  further  investigations  is 
the  comparison  of  the  presented  method  with  other  existing  methods. 
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